BREAST SPECIFIC GENES AND PROTEINS

This invention relates to newly identified 
polynucleotides, polypeptides encoded by such 
polynucleotides, and the use of such polynucleotides and 
polypeptides for detecting disorders of the breast, 
particularly the presence of breast cancer and breast cancer 
metastases. The present invention further relates to
inhibiting the production and function of the polypeptides of
the present invention. The twenty breast specific genes of
the present invention are sometimes hereinafter referred to
as "BSG1", "BSG2" etc.

The mammary gland is subject to a variety of disorders 
that should be readily detectable. Detection may be
accomplished by inspection, which usually consists of
palpation. Unfortunately, so few periodic self-examinations
are made that many breast masses are discovered only by
accidental palpation. Aspiration of suspected cysts with a
fine-gauge needle is another fairly common diagnostic
practice. Mammography or xeroradiography (soft-tissue x-ray)
of the breast is yet another. A biopsy of a lesion or
suspected area is an extreme method of diagnostic testing.

There are many types of tumors and cysts which affect 
the mammary gland. Fibroadenoma is the most common benign
breast tumor. As a pathological entity, it ranks third
behind cystic disease and carcinoma, respectively. These
tumors are seen most frequently in young people and are 
usually readily recognized because they feel encapsulated. 
Fibrocystic disease, a benign condition, is the most common 
disease of the female breast, occurring in about 20% of pre- 
menopausal women. Lipomas of the breast are also common and 
they are benign in nature. Carcinoma of the breast is the 
most common malignant condition among women and carries with 
it the highest fatality rate of all cancers affecting this 
sex. At some time during her life, one of every 15 women in the
USA will develop cancer of the breast. Its reported annual 
incidence is 70 per 100,000 females in the population in 
1947, rising to 72.5 in 1969 for whites, and rising from 47.8 
to 60.1 for blacks. The annual mortality rate from 1930 to 
the present has remained fairly constant, at approximately 23 
per 100,000 female population. Breast cancer is rare in men, 
but when it does occur, it is usually not recognized until late,
and thus the results of treatment are poor. In women, 
carcinoma of the breast is rarely seen before age 30 and the 
incidence rises rapidly after menopause. For this reason, 
post-menopausal breast masses should be considered cancer
until proved otherwise. 

In accordance with an aspect of the present invention, 
there are provided nucleic acid probes comprising nucleic
acid molecules of sufficient length to specifically hybridize 
to the RNA transcribed from the human breast specific genes 
of the present invention or to DNA corresponding to such RNA. 

In accordance with another aspect of the present
invention there is provided a method of and products for 
diagnosing breast cancer formation and breast cancer 
metastases by detecting the presence of RNA transcribed from 
the human breast specific genes of the present invention or 
DNA corresponding to such RNA in a sample derived from a 
host.

In accordance with yet another aspect of the present
invention, there is provided a method of and products for 
diagnosing breast cancer formation and breast cancer 
metastases by detecting an altered level of a polypeptide 
corresponding to the breast specific genes of the present 
invention in a sample derived from a host, whereby an 
elevated level of the polypeptide indicates a breast cancer 
diagnosis.

In accordance with another aspect of the present 
invention, there are provided isolated polynucleotides 
encoding human breast specific polypeptides, including mRNAs, 
DNAs, cDNAs, genomic DNAs, as well as antisense analogs and
biologically active and diagnostically or therapeutically 
useful fragments thereof. 

In accordance with still another aspect of the present 
invention there are provided human breast specific genes
which include polynucleotides as set forth in the sequence 
listing. 

In accordance with a further aspect of the present 
invention, there are provided novel polypeptides encoded by 
the polynucleotides, as well as biologically active and 
diagnostically or therapeutically useful fragments, analogs 
and derivatives thereof. 

In accordance with yet a further aspect of the present 
invention, there is provided a process for producing such 
polypeptides by recombinant techniques comprising culturing 
recombinant prokaryotic and/or eukaryotic host cells, 
containing a polynucleotide of the present invention, under 
conditions promoting expression of said proteins and 
subsequent recovery of said proteins. 

In accordance with yet a further aspect of the present 
invention, there are provided antibodies specific to such 
polypeptides, which may be employed to detect breast cancer 
cells or breast cancer metastasis. 



WO 97/02280   PCT/US95/08295



In accordance with another aspect of the present 
invention, there are provided processes for using one or more 
of the polypeptides of the present invention to treat breast 
cancer and for using the polypeptides to screen for compounds 
which interact with the polypeptides, for example, compounds 
which inhibit or activate the polypeptides of the present 
invention. 

In accordance with yet another aspect of the present 
invention, there is provided a screen for detecting compounds 
which inhibit activation of one or more of the 
polynucleotides and/or polypeptides of the present invention 
which may be used therapeutically, for example, in the
treatment of breast cancer. 

In accordance with yet a further aspect of the present 
invention, there are provided processes for utilizing such 
polypeptides, or polynucleotides encoding such polypeptides, 
for in vitro purposes related to scientific research, 
synthesis of DNA and manufacture of DNA vectors. 

These and other aspects of the present invention should 
be apparent to those skilled in the art from the teachings 
herein.

The following drawings are illustrative of embodiments 
of the invention and are not meant to limit the scope of the 
invention as encompassed by the claims. 

Figure 1 is a full length cDNA sequence of breast 
specific gene 1 of the present invention. 

Figure 2 is a partial cDNA sequence and the 
corresponding deduced amino acid sequence of breast specific 
gene 2 of the present invention. 

Figure 3 is a partial cDNA sequence and deduced amino 
acid sequence of breast specific gene 3 of the invention. 

Figure 4 is a partial cDNA sequence and the 
corresponding deduced amino acid sequence of breast specific 
gene 4 of the present invention.

Figure 5 is a partial cDNA sequence of breast specific
gene 5 of the present invention. 

Figure 6 is a partial cDNA and deduced amino acid 
sequence of breast specific gene 6 of the present invention. 

Figure 7 is a partial cDNA sequence of breast specific 
gene 7 of the present invention. 

Figure 8 is a partial cDNA sequence of breast specific 
gene 8 of the present invention. 

Figure 9 is a partial cDNA sequence of breast specific 
gene 9 of the present invention. 

Figure 10 is a partial cDNA sequence of breast specific 
gene 10 of the present invention. 

Figure 11 is a partial cDNA sequence of breast specific 
gene 11 of the present invention. 

Figure 12 is a partial cDNA sequence of breast specific 
gene 12 of the present invention. 

Figure 13 is a partial cDNA sequence of breast specific 
gene 13 of the present invention. 

Figure 14 is a partial cDNA sequence of breast specific 
gene 14 of the present invention. 

Figure 15 is a partial cDNA sequence of breast specific 
gene 15 of the present invention. 

Figure 16 is a partial cDNA sequence of breast specific 
gene 16 of the present invention. 

Figure 17 is a partial cDNA sequence of breast specific 
gene 17 of the present invention. 

Figure 18 is a partial cDNA sequence of breast specific 
gene 18 of the present invention. 

Figure 19 is a partial cDNA sequence of breast specific 
gene 19 of the present invention. 

Figure 20 is a partial cDNA sequence of breast specific 
gene 20 of the present invention. 

The term "breast specific gene" means that such gene is 
primarily expressed in tissues derived from the breast, and 
such genes may be expressed in cells derived from tissues
other than from the breast. However, the expression of such
genes is significantly higher in tissues derived from the 
breast than from non-breast tissues. 

In accordance with one aspect of the present invention 
there is provided a polynucleotide which encodes the mature 
polypeptides having the deduced amino acid sequence of Figure 
1 (SEQ ID NO:1) and fragments, analogues and derivatives
thereof.

In accordance with a further aspect of the present 
invention there is provided a polynucleotide which encodes 
the same mature polypeptide as a human gene having a coding 
portion which contains a polynucleotide which is at least 90% 
identical (preferably at least 95% identical and most 
preferably at least 97% or 100% identical) to one of the 
polynucleotides of Figures 2-20 (SEQ ID NO:2-20), as well as
fragments thereof. 

In accordance with still another aspect of the present 
invention there is provided a polynucleotide which encodes 
for the same mature polypeptide as a human gene whose coding 
portion includes a polynucleotide which is at least 90%
identical to (preferably at least 95% identical to and most
preferably at least 97% or 100% identical to) one of the
polynucleotides included in ATCC Deposit No. 97175 deposited 
June 2, 1995. 

In accordance with yet another aspect of the present 
invention, there is provided a polynucleotide probe which 
hybridizes to mRNA (or the corresponding cDNA) which is 
transcribed from the coding portion of a human gene which 
coding portion includes a DNA sequence which is at least 90%
identical to (preferably at least 95% identical to and most
preferably at least 97% or 100% identical to) one of the
polynucleotide sequences of Figures 1-20 (SEQ ID NO:1-20).

The present invention further relates to a mature 
polypeptide encoded by a coding portion of a human gene which 
coding portion includes a DNA sequence which is at least 90%
identical to (preferably at least 95% identical to and more
preferably 97% or 100% identical to) one of the 
polynucleotides of Figures 2-20 (SEQ ID NO:2-20), as well as 
analogues, derivatives and fragments of such polypeptides. 

The present invention also relates to one of the mature 
polypeptides of Figure 1 (SEQ ID NO:1) and fragments,
analogues and derivatives of such polypeptides. 

The present invention further relates to the same mature 
polypeptide encoded by a human gene whose coding portion 
includes DNA which is at least 90% identical to (preferably 
at least 95% identical to and more preferably at least 97% or 
100% identical to) one of the polynucleotides included in 
ATCC Deposit No. 97175 deposited June 2, 1995. 

In accordance with an aspect of the present invention, 
there are provided isolated nucleic acids (polynucleotides) 
which encode for the mature polypeptides having the deduced 
amino acid sequence of Figure 1 (SEQ ID NO:1) or fragments,
analogues or derivatives thereof. 

The polynucleotides of the present invention may be in 
the form of RNA or in the form of DNA, which DNA includes
cDNA, genomic DNA, and synthetic DNA. The DNA may be double- 
stranded or single-stranded, and if single stranded may be 
the coding strand or non-coding (anti-sense) strand. The 
coding sequence which encodes the mature polypeptide may 
include DNA identical to Figures 1-20 (SEQ ID NO: 1-20) or 
that of the deposited clone or may be a different coding 
sequence which coding sequence, as a result of the redundancy 
or degeneracy of the genetic code, encodes the same mature 
polypeptide as the coding sequence of a gene which coding 
sequence includes the DNA of Figures 1-20 (SEQ ID NO: 1-20) or 
the deposited cDNA. 
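
By way of illustration only, the degeneracy referred to above may
be sketched in Python as follows. The codon table is a small
excerpt of the standard genetic code, and the two coding sequences
are hypothetical; neither is taken from the Figures or from the
deposited clone.

```python
# Minimal sketch: two different coding sequences, one polypeptide.
# CODON_TABLE is only an excerpt of the standard genetic code, and
# seq_a / seq_b are illustrative sequences (assumptions, not data
# from the specification).
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "CTG": "L", "TTA": "L",
    "TAA": "*", "TGA": "*",
}


def translate(coding_strand: str) -> str:
    """Translate a coding-strand DNA sequence until a stop codon."""
    peptide = []
    for i in range(0, len(coding_strand) - 2, 3):
        amino_acid = CODON_TABLE[coding_strand[i:i + 3]]
        if amino_acid == "*":  # stop codon ends translation
            break
        peptide.append(amino_acid)
    return "".join(peptide)


seq_a = "ATGGCTAAACTGTAA"
seq_b = "ATGGCGAAGTTATGA"  # differs from seq_a at several codons

print(translate(seq_a), translate(seq_b))  # prints "MAKL MAKL"
```

The two sequences differ in four of five codons, yet both encode
the same four-residue polypeptide, which is the sense in which a
"different coding sequence" may encode the same mature polypeptide.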

The polynucleotide which encodes a mature polypeptide of 
the present invention may include, but is not limited to: 
only the coding sequence for the mature polypeptide; the 
coding sequence for the mature polypeptide and additional
coding sequence such as a leader or secretory sequence or a
proprotein sequence; the coding sequence for the mature 
polypeptide (and optionally additional coding sequence) and 
non-coding sequence, such as introns or non-coding sequence 
5' and/or 3' of the coding sequence for the mature 
polypeptide. 

Thus, the term "polynucleotide encoding a polypeptide" 
encompasses a polynucleotide which includes only coding 
sequence for the polypeptide as well as a polynucleotide 
which includes additional coding and/or non-coding sequence.

The present invention further relates to variants of the 
hereinabove described polynucleotides which encode fragments, 
analogs and derivatives of a mature polypeptide of the 
present invention. The variant of the polynucleotide may be 
a naturally occurring allelic variant of the polynucleotide 
or a non-naturally occurring variant of the polynucleotide. 

Thus, the present invention includes polynucleotides 
encoding the same mature polypeptide as hereinabove described 
as well as variants of such polynucleotides which variants 
encode a fragment, derivative or analog of a polypeptide of 
the invention. Such nucleotide variants include deletion 
variants, substitution variants and addition or insertion 
variants.

The polynucleotides of the invention may have a coding 
sequence which is a naturally occurring allelic variant of 
the human gene whose coding sequence includes DNA as shown in 
Figures 1-20 (SEQ ID NO: 1-20) or of the coding sequence of 
the DNA in the deposited clone. As known in the art, an 
allelic variant is an alternate form of a polynucleotide 
sequence which may have a substitution, deletion or addition 
of one or more nucleotides, which does not substantially 
alter the function of the encoded polypeptide.

The present invention also includes polynucleotides,
wherein the coding sequence for the mature polypeptide may be
fused in the same reading frame to a polynucleotide sequence
which aids in expression and secretion of a polypeptide from
a host cell, for example, a leader sequence which functions
as a secretory sequence for controlling transport of a
polypeptide from the cell. The polypeptide having a leader
sequence is a preprotein and may have the leader sequence
cleaved by the host cell to form the mature form of the
polypeptide. The polynucleotides may also encode a
proprotein which is the mature protein plus additional 5'
amino acid residues. A mature protein having a prosequence
is a proprotein and is an inactive form of the protein. Once
the prosequence is cleaved an active mature protein remains.

Thus, for example, the polynucleotide of the present 
invention may encode a mature protein, or a protein having a 
prosequence or a protein having both a prosequence and a
presequence (leader sequence).

The polynucleotides of the present invention may also 
have the coding sequence fused in frame to a marker sequence
which allows for purification of the polypeptide of the 
present invention. The marker sequence may be a hexa- 
histidine tag supplied by a pQE-9 vector to provide for 
purification of the mature polypeptide fused to the marker in 
the case of a bacterial host, or, for example, the marker 
sequence may be a hemagglutinin (HA) tag when a mammalian 
host, e.g. COS-7 cells, is used. The HA tag corresponds to
an epitope derived from the influenza hemagglutinin protein
(Wilson, I., et al., Cell, 37:767 (1984)).

The present invention further relates to 
polynucleotides which hybridize to the hereinabove-described
polynucleotides if there is at least 70%, preferably at least 
90%, and more preferably at least 95% identity between the 
sequences. The present invention particularly relates to 
polynucleotides which hybridize under stringent conditions to 
the hereinabove-described polynucleotides. As herein used, 
the term "stringent conditions" means hybridization will 
occur only if there is at least 95% and preferably at least
97% identity between the sequences. The polynucleotides
which hybridize to the hereinabove described polynucleotides 
in a preferred embodiment encode polypeptides which retain 
substantially the same biological function or activity as the 
mature polypeptide of the present invention encoded by a 
coding sequence which includes the DNA of Figures 1-20 (SEQ
ID NO:1-20) or the deposited cDNA(s).
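
The identity thresholds recited above may be illustrated by a
short Python sketch. The specification does not prescribe how
percent identity is computed; the sketch assumes, for simplicity,
two already-aligned sequences of equal length with no gaps. The
function names and the example sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions at which two equal-length sequences match.

    Assumption: the sequences are already aligned and ungapped.
    """
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def hybridizes_under_stringent_conditions(
    probe: str, target: str, threshold: float = 95.0
) -> bool:
    """Apply the 'at least 95% identity' criterion as a simple cutoff."""
    return percent_identity(probe, target) >= threshold


probe = "ACGTACGTACGTACGTACGT"
one_mismatch = "ACGTACGTACGTACGTACGA"   # 19 of 20 positions match
two_mismatches = "ACGTACGTACGTACGTACAA"  # 18 of 20 positions match

print(percent_identity(probe, one_mismatch))    # prints 95.0
print(percent_identity(probe, two_mismatches))  # prints 90.0
```

Under this simplified measure the first target satisfies the 95%
criterion and the second satisfies only the 90% criterion. In
practice hybridization also depends on temperature and salt
conditions, which this sketch does not model.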

Alternatively, the polynucleotide may have at least 10 
or 20 bases, preferably at least 30 bases, and more 
preferably at least 50 bases which hybridize to a 
polynucleotide of the present invention and which has an
identity thereto, as hereinabove described, and which may or
may not retain activity. For example, such polynucleotides
may be employed as probes for polynucleotides, for example,
for recovery of the polynucleotide or as a diagnostic probe
or as a PCR primer.

Thus, the present invention is directed to 
polynucleotides having at least a 70% identity, preferably at 
least 90% and more preferably at least 95% identity to a 
polynucleotide which encodes the mature polypeptide encoded 
by a human gene which includes the DNA of one of Figures 1-20
(SEQ ID NO: 1-20) as well as fragments thereof, which
fragments have at least 30 bases and preferably at least 50 
bases and to polypeptides encoded by such polynucleotides. 

The partial sequences are specific tags for messenger 
RNA molecules. The complete sequence of that messenger RNA, 
in the form of cDNA, can be determined using the partial 
sequence as a probe to identify a cDNA clone corresponding to 
a full-length transcript, followed by sequencing of that 
clone. The partial cDNA clone can also be used as a probe to 
identify a genomic clone or clones that contain the complete
gene including regulatory and promoter regions, exons, and
introns.

The partial sequences of Figures 2-20 (SEQ ID NO: 2-20) 
may be used to identify the corresponding full length gene
from which they were derived. The partial sequences can be
nick-translated or end-labelled with 32P using polynucleotide
kinase, by labelling methods known to those with skill in
the art (Basic Methods in Molecular Biology, L.G. Davis, M.D.
Dibner, and J.F. Battey, ed., Elsevier Press, NY, 1986). A
lambda library prepared from human breast tissue can be
directly screened with the labelled sequences of interest or
the library can be converted en masse to pBluescript
(Stratagene Cloning Systems, La Jolla, CA 92037) to
facilitate bacterial colony screening. Regarding
pBluescript, see Sambrook et al., Molecular Cloning-A
Laboratory Manual, Cold Spring Harbor Laboratory Press
(1989), pg. 1.20. Both methods are well known in the art.
Briefly, filters with bacterial colonies containing the 
library in pBluescript or bacterial lawns containing lambda 
plaques are denatured and the DNA is fixed to the filters. 
The filters are hybridized with the labelled probe using 
hybridization conditions described by Davis et al., supra.
The partial sequences, cloned into lambda or pBluescript, can 
be used as positive controls to assess background binding and 
to adjust the hybridization and washing stringencies 
necessary for accurate clone identification. The resulting 
autoradiograms are compared to duplicate plates of colonies 
or plaques; each exposed spot corresponds to a positive 
colony or plaque. The colonies or plaques are selected,
expanded and the DNA is isolated from the colonies for 
further analysis and sequencing. 

Positive cDNA clones are analyzed to determine the 
amount of additional sequence they contain using PCR with one 
primer from the partial sequence and the other primer from 
the vector. Clones with a larger vector-insert PCR product
than the original partial sequence are analyzed by 
restriction digestion and DNA sequencing to determine whether 
they contain an insert of the same or similar size as the
mRNA size determined from Northern blot analysis.

Once one or more overlapping cDNA clones are identified,
the complete sequence of the clones can be determined. The
preferred method is to use exonuclease III digestion 
(McCombie, W.R., Kirkness, E., Fleming, J.T., Kerlavage, A.R.,
Iovannisci, D.M., and Martin-Gallardo, R., Methods, 3:33-40,
1991). A series of deletion clones are generated, each of
which is sequenced. The resulting overlapping sequences are
assembled into a single contiguous sequence of high 
redundancy (usually three to five overlapping sequences at 
each nucleotide position), resulting in a highly accurate
final sequence. 
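
The assembly of redundant overlapping sequences into one
contiguous sequence may be sketched as follows. This is a
simplified illustration only: it assumes that the position
(offset) of each read on the final sequence is already known, and
it resolves disagreements by majority vote. The reads shown are
hypothetical and the specification does not prescribe this
particular procedure.

```python
from collections import Counter


def consensus(reads):
    """Build a consensus from (offset, sequence) reads by majority vote.

    Returns the consensus string and the read depth at each position.
    Positions covered by no read are reported as "N" with depth 0.
    """
    length = max(offset + len(seq) for offset, seq in reads)
    columns = [Counter() for _ in range(length)]
    for offset, seq in reads:
        for i, base in enumerate(seq):
            columns[offset + i][base] += 1

    bases, depth = [], []
    for column in columns:
        depth.append(sum(column.values()))
        bases.append(column.most_common(1)[0][0] if column else "N")
    return "".join(bases), depth


# Four overlapping reads; the last one carries a single error
# (A instead of T at position 4 of the contiguous sequence).
reads = [
    (0, "ACGTTGCA"),
    (2, "GTTGCAAG"),
    (4, "TGCAAGCT"),
    (0, "ACGTAGCA"),
]

consensus_seq, depth = consensus(reads)
print(consensus_seq)  # prints "ACGTTGCAAGCT"
```

At position 4 the four overlapping reads vote three to one, so the
single erroneous base is outvoted. This is why a redundancy of
three to five reads per position yields an accurate final sequence.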

The DNA sequences (as well as the corresponding RNA 
sequences) also include sequences which are or contain a DNA 
sequence identical to one contained in and isolatable from 
ATCC Deposit No. 97175, deposited June 2, 1995, and fragments
or portions of the isolated DNA sequences (and corresponding
RNA sequences), as well as DNA (RNA) sequences encoding the
same polypeptide. 

The deposit(s) referred to herein will be maintained
under the terms of the Budapest Treaty on the International 
Recognition of the Deposit of Micro-organisms for purposes of 
Patent Procedure. These deposits are provided merely as a
convenience to those of skill in the art and are not an
admission that a deposit is required under 35 U.S.C. §112.
The sequence of the polynucleotides contained in the 
deposited materials, as well as the amino acid sequence of 
the polypeptides encoded thereby, are incorporated herein by 
reference and are controlling in the event of any conflict 
with any description of sequences herein. A license may be 
required to make, use or sell the deposited materials, and 
no such license is hereby granted. 

The present invention further relates to polynucleotides
which have at least 10 bases, preferably at least 20 bases, 
and may have 30 or more bases, which polynucleotides are 
hybridizable to and have at least a 70% identity to RNA (and
DNA which corresponds to such RNA) transcribed from a human
gene whose coding portion includes DNA as hereinabove 
described. 

Thus, the polynucleotide sequences which hybridize as 
described above may be used to hybridize to and detect the 
expression of the human genes to which they correspond for 
use in diagnostic assays as hereinafter described. 

In accordance with still another aspect of the present 
invention there are provided diagnostic assays for detecting 
micrometastases of breast cancer in a host. While applicant 
does not wish to limit the reasoning of the present invention 
to any specific scientific theory, it is believed that the 
presence of active transcription of a breast specific gene of 
the present invention in cells of the host, other than those 
derived from the breast, is indicative of breast cancer 
metastases. This is true because, while the breast specific 
genes are found in all cells of the body, their transcription 
to mRNA, cDNA and expression products is primarily limited to
the breast in non-diseased individuals. However, if breast
cancer is present, breast cancer cells migrate from the
cancer to other cells, such that these other cells are now
actively transcribing and expressing a breast specific gene
at a greater level than is normally found in non-diseased
individuals, i.e., transcription is higher than found in non-
breast tissues in healthy individuals. It is the detection
of this enhanced transcription or enhanced protein expression 
in cells, other than those derived from the breast, which is 
indicative of metastases of breast cancer. 

In one example of such a diagnostic assay, an RNA 
sequence in a sample derived from a tissue other than the 
breast is detected by hybridization to a probe. The sample 
contains a nucleic acid or a mixture of nucleic acids, at 
least one of which is suspected of containing a human breast 
specific gene or fragment thereof of the present invention 
which is transcribed and expressed in such tissue. Thus, for
example, in a form of an assay for determining the presence
of a specific RNA in cells, initially RNA is isolated from 
the cells. 

A sample may be obtained from cells derived from tissue
other than from the breast including but not limited to 
blood, urine, saliva, tissue biopsy and autopsy material. 
The use of such methods for detecting enhanced transcription 
to mRNA from a human breast specific gene of the present 
invention or fragment thereof in a sample obtained from cells 
derived from other than the breast is well within the scope 
of those skilled in the art from the teachings herein. 

The isolation of mRNA comprises isolating total cellular 
RNA by disrupting a cell and performing differential
centrifugation. Once the total RNA is isolated, mRNA is 
isolated by making use of the adenine nucleotide residues 
known to those skilled in the art as a poly(A) tail found on
virtually every eukaryotic mRNA molecule at the 3' end 
thereof. Oligonucleotides composed of only deoxythymidine 
[oligo(dT)] are linked to cellulose and the oligo(dT)- 
cellulose packed into small columns. When a preparation of 
total cellular RNA is passed through such a column, the mRNA 
molecules bind to the oligo(dT) by the poly(A) tails while the
rest of the RNA flows through the column. The bound mRNAs 
are then eluted from the column and collected. 
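
The selection principle described above may be illustrated by a
purely computational analogy in Python. The sketch treats an RNA
as "bound" if its 3' end carries a run of adenine residues; the
minimum tail length of 10 and the example sequences are
assumptions chosen for illustration and do not model actual
column chemistry.

```python
import re


def oligo_dt_select(rnas, min_tail=10):
    """Split RNAs into those with a 3' poly(A) tail and the rest.

    rnas: mapping of name -> RNA sequence written 5' to 3'.
    min_tail: assumed minimum number of terminal A residues.
    """
    tail = re.compile(r"A{%d,}$" % min_tail)
    bound = {n: s for n, s in rnas.items() if tail.search(s.upper())}
    flow_through = {n: s for n, s in rnas.items() if n not in bound}
    return bound, flow_through


rnas = {
    "mRNA_1": "AUGGCUAAGCUGUAA" + "A" * 25,   # polyadenylated
    "rRNA_fragment": "GGCUACACACGUGCUACAAUGG",
    "tRNA_like": "GCGGAUUUAGCUCAGUUGGGAGAGC",
}

bound, flow_through = oligo_dt_select(rnas)
print(sorted(bound))         # prints ['mRNA_1']
print(sorted(flow_through))  # prints ['rRNA_fragment', 'tRNA_like']
```

Only the polyadenylated molecule is retained, corresponding to
the mRNA fraction that is subsequently eluted from the column.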

One example of detecting isolated mRNA transcribed from 
a breast specific gene of the present invention comprises 
screening the collected mRNAs with the gene specific 
oligonucleotide probes, as hereinabove described. 

It is also appreciated that such probes can be and are 
preferably labeled with an analytically detectable reagent to 
facilitate identification of the probe. Useful reagents 
include but are not limited to radioactivity, fluorescent 
dyes or enzymes capable of catalyzing the formation of a
detectable product.

An example of detecting a polynucleotide complementary
to the mRNA sequence (cDNA) utilizes the polymerase chain 
reaction (PCR) in conjunction with reverse transcriptase.
PCR is a very powerful method for the specific amplification
of DNA or RNA stretches (Saiki et al., Nature, 324:163-166
(1986)). One application of this technology is in nucleic 
acid probe technology to bring up nucleic acid sequences 
present in low copy numbers to a detectable level. Numerous 
diagnostic and scientific applications of this method have 
been described by H.A. Erlich (ed.) in PCR Technology- 
Principles and Applications for DNA Amplification, Stockton
Press, USA, 1989, and by M.A. Innis (ed.) in PCR Protocols,
Academic Press, San Diego, USA, 1990. 
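
The manner in which PCR brings low copy numbers to a detectable
level may be illustrated with an idealized model in which each
cycle multiplies the number of target molecules by (1 +
efficiency). The model, the function names and the numerical
values below are assumptions for illustration; real reactions
plateau and rarely sustain perfect doubling.

```python
def copies_after(initial_copies: float, cycles: int,
                 efficiency: float = 1.0) -> float:
    """Idealized copy number after a given number of PCR cycles.

    efficiency = 1.0 means perfect doubling in every cycle.
    """
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_to_reach(initial_copies: float, target_copies: float,
                    efficiency: float = 1.0) -> int:
    """Smallest number of cycles giving at least target_copies."""
    if initial_copies <= 0 or efficiency <= 0:
        raise ValueError("initial_copies and efficiency must be positive")
    copies, cycles = float(initial_copies), 0
    while copies < target_copies:
        copies *= 1.0 + efficiency
        cycles += 1
    return cycles


print(copies_after(10, 30))       # prints 10737418240.0
print(cycles_to_reach(10, 1e9))   # prints 27
```

Under perfect doubling, ten starting molecules exceed one billion
copies after 27 cycles, which illustrates why transcripts from a
small number of cells can become detectable.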

RT-PCR is a combination of PCR with the reverse
transcriptase enzyme. Reverse transcriptase is an enzyme
which produces cDNA molecules from corresponding mRNA
molecules. This is important since PCR amplifies nucleic
acid molecules, particularly DNA, and this DNA may be 
produced from the mRNA isolated from a sample derived from 
the host. 
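
The relationship between an mRNA and the first-strand cDNA
produced from it by reverse transcriptase may be sketched as
follows. The function name and the example sequence are
hypothetical; the sketch shows only the base-pairing rule and
does not model priming or enzyme behavior.

```python
# Base pairing of an RNA template with the DNA strand copied from it.
RNA_TO_CDNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna: str) -> str:
    """Return the first-strand cDNA, written 5' to 3'.

    The cDNA is complementary and antiparallel to the mRNA, so the
    template is read in reverse while each base is complemented.
    """
    return "".join(RNA_TO_CDNA[base] for base in reversed(mrna.upper()))


mrna = "AUGGCUAAAAAA"  # short illustrative transcript with an A-rich 3' end
print(first_strand_cdna(mrna))  # prints "TTTTTTAGCCAT"
```

The run of T residues at the 5' end of the cDNA corresponds to
the A residues at the 3' end of the mRNA, and it is this DNA copy
that serves as the template for subsequent PCR amplification.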

A specific example of an RT-PCR diagnostic assay
involves removing a sample from a tissue of a host. Such a
sample will be from a tissue, other than the breast, for 
example, blood. Therefore, an example of such a diagnostic 
assay comprises whole blood gradient isolation of nucleated 
cells, total RNA extraction, RT-PCR of total RNA and agarose
gel electrophoresis of PCR products. The PCR products 
comprise cDNA complementary to RNA transcribed from one or 
more breast specific genes of the present invention or 
fragments thereof. More particularly, a blood sample is 
obtained and the whole blood is combined with an equal volume 
of phosphate buffered saline, centrifuged and the lymphocyte
and granulocyte layer is carefully aspirated and rediluted in
phosphate buffered saline and centrifuged again. The
supernate is discarded and the pellet containing nucleated
cells is used for RNA extraction using the RNazole B method
as described by the manufacturer (Tel-Test Inc., Friendswood,
TX).

Oligonucleotide primers and probes are prepared with 
high specificity to the DNA sequences of the present 
invention. The probes are at least 10 base pairs in length, 
preferably at least 30 base pairs in length and most 
preferably at least 50 base pairs in length or more. The 
reverse transcriptase reaction and PCR amplification are 
performed sequentially without interruption. Taq polymerase
is used during PCR and the PCR products are concentrated and
the entire sample is run on a Tris-borate-EDTA agarose gel
containing ethidium bromide. 

In accordance with another aspect of the present 
invention, there is provided a method of diagnosing a 
disorder of the breast, for example breast cancer, by 
determining altered levels of the breast specific 
polypeptides of the present invention in a biological sample, 
derived from tissue other than from the breast. Elevated
levels of the breast specific polypeptides of the present
invention indicate active transcription and expression of
the corresponding breast specific gene product. Assays used 
to detect levels of a breast specific gene polypeptide in a 
sample derived from a host are well-known to those skilled in 
the art and include radioimmunoassays, competitive-binding 
assays, Western blot analysis, ELISA assays and "sandwich" 
assays. A biological sample may include, but is not limited
to, tissue extracts, cell samples or biological fluids;
however, in accordance with the present invention, a
biological sample specifically does not include tissue or 
cells of the breast. 

An ELISA assay (Coligan, et al., Current Protocols in
Immunology, 1(2), Chapter 6, 1991) initially comprises
preparing an antibody specific to a breast specific 
polypeptide of the present invention, preferably a monoclonal
antibody. In addition, a reporter antibody is prepared
against the monoclonal antibody. To the reporter antibody is 
attached a detectable reagent such as radioactivity, 
fluorescence or, in this example, a horseradish peroxidase 
enzyme. A sample is removed from a host and incubated on a 
solid support, e.g., a polystyrene dish, that binds the 
proteins in the sample. Any free protein binding sites on 
the dish are then covered by incubating with a non-specific 
protein, such as BSA. Next, the monoclonal antibody is 
incubated in the dish during which time the monoclonal 
antibodies attach to the breast specific polypeptide attached 
to the polystyrene dish. All unbound monoclonal antibody is 
washed out with buffer. The reporter antibody linked to 
horseradish peroxidase is now placed in the dish resulting in 
binding of the reporter antibody to any monoclonal antibody 
bound to the breast specific gene polypeptide. Unattached 
reporter antibody is then washed out. Peroxidase substrates 
are then added to the dish and the amount of color developed 
in a given time period is a measurement of the amount of the 
breast specific polypeptide present in a given volume of 
patient sample when compared against a standard curve. 
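
The comparison against a standard curve can be sketched as follows. This is a minimal illustration only: the function name, the linear interpolation between neighbouring standards, and the example readings are assumptions made for this sketch and are not taken from the specification.

```python
from bisect import bisect_left


def concentration_from_standard_curve(signal, standards):
    """Estimate a concentration from a measured colour signal.

    standards: (concentration, signal) pairs measured on known dilutions.
    Assumes the signal rises monotonically with concentration.
    """
    # Order the standards by signal so the bracketing pair can be located.
    points = sorted(standards, key=lambda pair: pair[1])
    signals = [s for _, s in points]
    if not signals[0] <= signal <= signals[-1]:
        raise ValueError("signal lies outside the range of the standard curve")
    i = bisect_left(signals, signal)
    if signals[i] == signal:
        return points[i][0]
    (c_lo, s_lo), (c_hi, s_hi) = points[i - 1], points[i]
    # Linear interpolation between the two neighbouring standards.
    return c_lo + (c_hi - c_lo) * (signal - s_lo) / (s_hi - s_lo)


# Hypothetical standards: (concentration, absorbance) in arbitrary units.
standards = [(0.0, 0.05), (1.0, 0.25), (2.0, 0.45), (4.0, 0.85)]
estimate = concentration_from_standard_curve(0.35, standards)
```

A reading outside the range of the standards raises an error rather than extrapolating, since extrapolated values are generally less reliable than interpolated ones.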

A competition assay may be employed where antibodies 
specific to a breast specific polypeptide are attached to a 
solid support. The breast specific polypeptide is then 
labeled, and the labeled polypeptide and a sample derived from the 
host are passed over the solid support and the amount of 
label detected, for example, by liquid scintillation 
chromatography, can be correlated to a quantity of the breast 
specific polypeptide in the sample. 

A "sandwich" assay is similar to an ELISA assay. In a 
"sandwich" assay, breast specific polypeptides are passed 
over a solid support and bind to antibody attached to the 
solid support. A second antibody is then bound to the breast 
specific polypeptide. A third antibody which is labeled and 
is specific to the second antibody, is then passed over the 
solid support and binds to the second antibody and an amount 
can then be quantified. 

In alternative methods, labeled antibodies to a breast 
specific polypeptide are used. In a one-step assay, the 
target molecule, if it is present, is immobilized and 
incubated with a labeled antibody. The labeled antibody 
binds to the immobilized target molecule. After washing to 
remove the unbound molecules, the sample is assayed for the 
presence of the label. In a two-step assay, immobilized 
target molecule is incubated with an unlabeled antibody. The 
target molecule-labeled antibody complex, if present, is then 
bound to a second, labeled antibody that is specific for the 
unlabeled antibody. The sample is washed and assayed for the 
presence of the label. 

Such antibodies specific to breast specific gene 
proteins, for example, anti-idiotypic antibodies, can be used 
to detect breast cancer cells by being labeled as described 
above and binding tightly to the breast cancer cells, and, 
therefore, detect their presence. 

The antibodies may also be used to target breast cancer 
cells, for example, in a method of homing interaction agents 
which, when contacting breast cancer cells, destroy them. 
This is true since the antibodies are specific for breast 
specific genes which are primarily expressed in breast 
cancer, and a linking of the interaction agent to the 
antibody would cause the interaction agent to be carried 
directly to the breast. 

Antibodies of this type may also be used to do in vivo 
imaging, for example, by labeling the antibodies to 
facilitate scanning of the breast. One method for imaging 
comprises contacting any cancer cells of the breast to be 
imaged with an anti-breast specific gene protein antibody 
labeled with a detectable marker. The method is performed 
under conditions such that the labeled antibody binds to the 
breast specific gene proteins. In a specific example, the 
antibodies interact with the breast, for example, breast 
cancer cells, and fluoresce upon such contact such that 
imaging and visibility of the breast is enhanced to allow a 
determination of the diseased or non-diseased state of the 
breast. 

The choice of marker used to label the antibodies will 
vary depending upon the application. However, the choice of 
marker is readily determinable to one skilled in the art. 
These labeled antibodies may be used in immunoassays as well 
as in histological applications to detect the presence of the 
proteins. The labeled antibodies may be polyclonal or 
monoclonal. 

The presence of active transcription, which is greater 
than that normally found, of the breast specific genes in 
cells other than from the breast, by the presence of an 
altered level of mRNA, cDNA or expression products is an 
important indication of the presence of a breast cancer which 
has metastasized, since breast cancer cells are migrating 
from the breast into the general circulation. Accordingly, 
this phenomenon may have important clinical implications 
since the method of treating a localized, as opposed to a 
metastasized, tumor is entirely different. 

Of the 20 breast specific genes disclosed, only breast 
specific gene 1 is a full-length gene. Breast specific gene 
1 is 79% identical and 83% similar to human Alzheimer disease 
amyloid gene. Breast specific gene 2 is 30% identical and 
48% similar to human hydroxyindole-O-methyltransferase gene. 
Breast specific gene 3 is 58% identical and 62% similar to 
human O6-methylguanine-DNA methyltransferase gene. Breast 
specific gene 4 is 34% identical and 65% similar to the mouse 
p120 gene. Breast specific gene 5 is 78% identical and 89% 
similar to human p70 ribosomal S6 kinase alpha-II gene. 
Breast specific gene 6 is 77% identical and 79% similar to 
the human transcription factor NFATp gene. 

As stated previously, the breast specific genes of the 
present invention are putative molecular markers in the 
diagnosis of breast cancer formation, and breast cancer 
metastases. As shown in the following Table 1, when the 
presence of the breast specific genes was tested in normal 
breast, breast cancer, embryo and other cancer libraries, the 
breast specific genes of the present invention were found to 
be most prevalent in the breast cancer library, indicating 
that the genes of the present invention may be employed for 
detecting breast cancer, as discussed previously. The table 
also indicates a putative identification, based on homology, 
of BSG1 through BSG6 to known genes. 



Table 1

Genes   Homolog Gene Name (Class)                     Br   Breast Cancer   Embryo   Other Cancers   Others
-----   -------------------------------------------   --   -------------   ------   -------------   ------
BSG1    AD Amyloid (3)                                1    6               -        -               1
BSG2    Hydroxyindole-O-methyltransferase (2)         -    3               1        1               -
BSG3    O-6-methylguanine-DNA methyltransferase (1)   -    3               1        1               -
BSG4    P120 (3)                                      -    3               -        1               -
BSG5    p70 ribosomal S6 kinase alpha-II (2)          -    3               -        1               -
BSG6    Transcription factor NFATp (3)                2    -               -        -               -
BSG7    -                                             -    2               1        -               -
BSGe    -                                             -    4               -        3               1
BSG8    -                                             -    2               -        -               -
BSG9    -                                             -    3               -        -               -
BSG10   -                                             -    3               -        -               -
BSG11   -                                             -    3               -        -               -
BSG12   -                                             3    3               -        -               -
BSG13   -                                             -    3               -        -               -
BSG14   -                                             -    2               -        -               -
BSG15   -                                             3    -               -        -               -
BSG16   -                                             1    1               1        -               -
BSG17   -                                             -    2               -        1               -
BSG18   -                                             2    -               -        -               -
BSG19   -                                             1    1               -        -               -
BSG20   -                                             -    2               -        -               -

The assays described above may also be used to test 
whether bone marrow preserved before chemotherapy is 
contaminated with micrometastases of a breast cancer cell. 
In the assay, blood cells from the bone marrow are isolated 
and treated as described above; this method allows one to 
determine whether preserved bone marrow is still suitable for 
transplantation after chemotherapy. 

The present invention further relates to mature 
polypeptides, for example the BSG1 polypeptide, as well as 
fragments, analogs and derivatives of such polypeptide. 

The terms "fragment," "derivative" and "analog" when 
referring to the polypeptides encoded by the genes of the 
invention mean a polypeptide which retains essentially the 
same biological function or activity as such polypeptide. 
Thus, an analog includes a proprotein which can be activated 
by cleavage of the proprotein portion to produce an active 
mature polypeptide. 

The polypeptides of the present invention may be 
recombinant polypeptides, natural polypeptides or synthetic 
polypeptides, preferably recombinant polypeptides. 

The fragment, derivative or analog of the polypeptides 
encoded by the genes of the invention may be (i) one in which 
one or more of the amino acid residues are substituted with 
a conserved or non-conserved amino acid residue (preferably 
a conserved amino acid residue) and such substituted amino 
acid residue may or may not be one encoded by the genetic 
code, or (ii) one in which one or more of the amino acid 
residues includes a substituent group, or (iii) one in which 
the polypeptide is fused with another compound, such as a 
compound to increase the half-life of the polypeptide (for 
example, polyethylene glycol), or (iv) one in which the 
additional amino acids are fused to the polypeptide, such as 
a leader or secretory sequence or a sequence which is 
employed for purification of the mature polypeptide or a 
proprotein sequence. Such fragments, derivatives and analogs 
are deemed to be within the scope of those skilled in the art 
from the teachings herein. 

The polypeptides and polynucleotides of the present 
invention are preferably provided in an isolated form, and 
preferably are purified to homogeneity. 

The term "isolated" means that the material is removed 
from its original environment (e.g., the natural environment 
if it is naturally occurring). For example, a 
naturally-occurring polynucleotide or polypeptide present in a living 
animal is not isolated, but the same polynucleotide or 
polypeptide, separated from some or all of the coexisting 
materials in the natural system, is isolated. Such 
polynucleotides could be part of a vector and/or such 
polynucleotides or polypeptides could be part of a 
composition, and still be isolated in that such vector or 
composition is not part of its natural environment. 

The polypeptides of the present invention include the 
polypeptides of Figure 1 (SEQ ID NO:1) (in particular the 
mature polypeptides) as well as polypeptides which have at 
least 70% similarity (preferably at least a 70% identity) to 
the polypeptides of Figure 1 (SEQ ID NO:1) and more 
preferably at least a 90% similarity (more preferably at 
least a 90% identity) to the polypeptides of Figures 8 and 9 
(SEQ ID NO:8 and 9) and still more preferably at least a 95% 
similarity (still more preferably at least 95% identity) to 
the polypeptides of Figure 1 (SEQ ID NO:1) and also include 
portions of such polypeptides with such portion of the 
polypeptide generally containing at least 30 amino acids and 
more preferably at least 50 amino acids. 

As known in the art, "similarity" between two 
polypeptides is determined by comparing the amino acid 
sequence and its conserved amino acid substitutes of one 
polypeptide to the sequence of a second polypeptide. 
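
The distinction between percent identity and percent similarity can be sketched as follows. This is an illustrative sketch only: the conservative-substitution groups, the treatment of gap positions (skipped), the function name and the example sequences are all assumptions made here, and the specification does not prescribe any particular scoring scheme.

```python
# Hypothetical groups of residues treated as conservative substitutes.
# A real comparison would use a scoring scheme chosen by the analyst.
CONSERVATIVE_GROUPS = [
    set("ILVM"), set("FWY"), set("KRH"), set("DE"),
    set("ST"), set("NQ"), set("AG"),
]


def identity_and_similarity(seq_a, seq_b):
    """Return (percent identity, percent similarity) for two aligned sequences.

    Both sequences must already be aligned to equal length, with '-' marking
    gaps. Gap positions are skipped and do not count toward either figure.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = identical = similar = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            # An identical residue also counts as similar.
            identical += 1
            similar += 1
        elif any(a in group and b in group for group in CONSERVATIVE_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


# Made-up aligned fragments, for illustration only.
identity, similarity = identity_and_similarity("MKVLDE-T", "MRVIDEAS")
```

Because identical residues are counted as similar, the similarity figure is never lower than the identity figure, which matches the pairs of percentages quoted for the breast specific genes.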

Fragments or portions of the polypeptides of the present 
invention may be employed for producing the corresponding 
full-length polypeptide by peptide synthesis; therefore, the 
fragments may be employed as intermediates for producing the 
full-length polypeptides. Fragments or portions of the 
polynucleotides of the present invention may be used to 
synthesize full-length polynucleotides of the present 
invention. 

The present invention also relates to vectors which 
include polynucleotides of the present invention, host cells 
which are genetically engineered with vectors of the 
invention and the production of polypeptides of the invention 
by recombinant techniques. 

Host cells are genetically engineered (transduced or 
transformed or transfected) with the vectors of this 
invention which may be, for example, a cloning vector or an 
expression vector. The vector may be, for example, in the 
form of a plasmid, a viral particle, a phage, etc. The 
engineered host cells can be cultured in conventional 
nutrient media modified as appropriate for activating 
promoters, selecting transformants or amplifying the breast 
specific genes. The culture conditions, such as temperature, 
pH and the like, are those previously used with the host cell 
selected for expression, and will be apparent to those of 
ordinary skill in the art. 

The polynucleotides of the present invention may be 
employed for producing polypeptides by recombinant 
techniques. Thus, for example, the polynucleotide may be 
included in any one of a variety of expression vectors for 
expressing a polypeptide. Such vectors include chromosomal, 
nonchromosomal and synthetic DNA sequences, e.g., 
derivatives of SV40; bacterial plasmids; phage DNA; 
baculovirus; yeast plasmids; vectors derived from 
combinations of plasmids and phage DNA, viral DNA such as 
vaccinia, adenovirus, fowl pox virus, and pseudorabies. 
However, any other vector may be used as long as it is 
replicable and viable in the host. 

The appropriate DNA sequence may be inserted into the 
vector by a variety of procedures. In general, the DNA 
sequence is inserted into an appropriate restriction 
endonuclease site(s) by procedures known in the art. Such 
procedures and others are deemed to be within the scope of 
those skilled in the art. 

The DNA sequence in the expression vector is operatively 
linked to an appropriate expression control sequence(s) 
(promoter) to direct mRNA synthesis. As representative 
examples of such promoters, there may be mentioned: LTR or 
SV40 promoter, the E. coli lac or trp, the phage lambda PL 
promoter and other promoters known to control expression of 
genes in prokaryotic or eukaryotic cells or their viruses. 
The expression vector also contains a ribosome binding site 
for translation initiation and a transcription terminator. 
The vector may also include appropriate sequences for 
amplifying expression. 

In addition, the expression vectors preferably contain 
one or more selectable marker genes to provide a phenotypic 
trait for selection of transformed host cells such as 
dihydrofolate reductase or neomycin resistance for eukaryotic 
cell culture, or such as tetracycline or ampicillin 
resistance in E. coli. 

The vector containing the appropriate DNA sequence as 
hereinabove described, as well as an appropriate promoter or 
control sequence, may be employed to transform an appropriate 
host to permit the host to express the protein. 

As representative examples of appropriate hosts, there 
may be mentioned: bacterial cells, such as E. coli, 
Streptomyces, Salmonella typhimurium; fungal cells, such as 
yeast; insect cells such as Drosophila S2 and Spodoptera Sf9; 
animal cells such as CHO, COS or Bowes melanoma; 
adenoviruses; plant cells, etc. The selection of an 
appropriate host is deemed to be within the scope of those 
skilled in the art from the teachings herein. 

More particularly, the present invention also includes 
recombinant constructs comprising one or more of the 
sequences as broadly described above. The constructs 
comprise a vector, such as a plasmid or viral vector, into 
which a sequence of the invention has been inserted, in a 
forward or reverse orientation. In a preferred aspect of 
this embodiment, the construct further comprises regulatory 
sequences, including, for example, a promoter, operably 
linked to the sequence. Large numbers of suitable vectors 
and promoters are known to those of skill in the art, and are 
commercially available. The following vectors are provided 
by way of example. Bacterial: pQE70, pQE60, pQE-9 (Qiagen), 
pBS, pD10, phagescript, psiX174, pbluescript SK, pBSKS, 
pNH8A, pNH16a, pNH18A, pNH46A (Stratagene); ptrc99a, pKK223-3, 
pKK233-3, pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLNEO, 
pSV2CAT, pOG44, pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, 
pSVL (Pharmacia). However, any other plasmid or vector may 
be used as long as they are replicable and viable in the 
host. 

Promoter regions can be selected from any desired gene 
using CAT (chloramphenicol transferase) vectors or other 
vectors with selectable markers. Two appropriate vectors are 
pKK232-8 and pCM7. Particular named bacterial promoters 
include lacI, lacZ, T3, T7, gpt, lambda PR, PL and trp. 
Eukaryotic promoters include CMV immediate early, HSV 
thymidine kinase, early and late SV40, LTRs from retrovirus, 
and mouse metallothionein-I. Selection of the appropriate 
vector and promoter is well within the level of ordinary 
skill in the art. 

In a further embodiment, the present invention relates 
to host cells containing the above-described constructs. The 
host cell can be a higher eukaryotic cell, such as a 
mammalian cell, or a lower eukaryotic cell, such as a yeast 
cell, or the host cell can be a prokaryotic cell, such as a 
bacterial cell. Introduction of the construct into the host 
cell can be effected by calcium phosphate transfection, 
DEAE-Dextran mediated transfection, or electroporation (Davis, L., 
Dibner, M., Battey, I., Basic Methods in Molecular Biology, 
(1986)). 

The constructs in host cells can be used in a 
conventional manner to produce the gene product encoded by 
the recombinant sequence. Alternatively, the polypeptides of 
the invention can be synthetically produced by conventional 
peptide synthesizers. 

Proteins can be expressed in mammalian cells, yeast, 
bacteria, or other cells under the control of appropriate 
promoters. Cell-free translation systems can also be 
employed to produce such proteins using RNAs derived from the 
DNA constructs of the present invention. Appropriate cloning 
and expression vectors for use with prokaryotic and 
eukaryotic hosts are described by Sambrook, et al., Molecular 
Cloning: A Laboratory Manual, Second Edition, Cold Spring 
Harbor, N.Y., (1989), the disclosure of which is hereby 
incorporated by reference. 

Transcription of the DNA encoding the polypeptides of 
the present invention by higher eukaryotes is increased by 
inserting an enhancer sequence into the vector. Enhancers 
are cis-acting elements of DNA, usually from about 10 to 300 
bp, that act on a promoter to increase its transcription. 
Examples include the SV40 enhancer on the late side of the 
replication origin bp 100 to 270, a cytomegalovirus early 
promoter enhancer, the polyoma enhancer on the late side of 
the replication origin, and adenovirus enhancers. 

Generally, recombinant expression vectors will include 
origins of replication and selectable markers permitting 
transformation of the host cell, e.g., the ampicillin 
resistance gene of E. coli and S. cerevisiae TRP1 gene, and 
a promoter derived from a highly-expressed gene to direct 
transcription of a downstream structural sequence. Such 
promoters can be derived from operons encoding glycolytic 
enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, 
acid phosphatase, or heat shock proteins, among others. The 
heterologous structural sequence is assembled in appropriate 
phase with translation initiation and termination sequences. 
Optionally, the heterologous sequence can encode a fusion 
protein including an N-terminal identification peptide 
imparting desired characteristics, e.g., stabilization or 
simplified purification of expressed recombinant product. 

Useful expression vectors for bacterial use are 
constructed by inserting a structural DNA sequence encoding 
a desired protein together with suitable translation 
initiation and termination signals in operable reading frame 
with a functional promoter. The vector will comprise one or 
more phenotypic selectable markers and an origin of 
replication to ensure maintenance of the vector and to, if 
desirable, provide amplification within the host. Suitable 
prokaryotic hosts for transformation include E. coli, 
Bacillus subtilis, Salmonella typhimurium and various species 
within the genera Pseudomonas, Streptomyces, and 
Staphylococcus, although others may also be employed as a 
matter of choice. 

As a representative but nonlimiting example, useful 
expression vectors for bacterial use can comprise a 
selectable marker and bacterial origin of replication derived 
from commercially available plasmids comprising genetic 
elements of the well known cloning vector pBR322 (ATCC 
37017). Such commercial vectors include, for example, 
pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM1 
(Promega Biotec, Madison, WI, USA). These pBR322 "backbone" 
sections are combined with an appropriate promoter and the 
structural sequence to be expressed. 

Following transformation of a suitable host strain and 
growth of the host strain to an appropriate cell density, the 
selected promoter is induced by appropriate means (e.g., 
temperature shift or chemical induction) and cells are 
cultured for an additional period. 

Cells are typically harvested by centrifugation, 
disrupted by physical or chemical means, and the resulting 
crude extract retained for further purification. 

Microbial cells employed in expression of proteins can 
be disrupted by any convenient method, including freeze-thaw 
cycling, sonication, mechanical disruption, or use of cell 
lysing agents; such methods are well known to those skilled in 
the art. 

Various mammalian cell culture systems can also be 
employed to express recombinant protein. Examples of 
mammalian expression systems include the COS-7 lines of 
monkey kidney fibroblasts, described by Gluzman, Cell, 23:175 
(1981), and other cell lines capable of expressing a 
compatible vector, for example, the C127, 3T3, CHO, HeLa and 
BHK cell lines. Mammalian expression vectors will comprise 
an origin of replication, a suitable promoter and enhancer, 
and also any necessary ribosome binding sites, 
polyadenylation site, splice donor and acceptor sites, 
transcriptional termination sequences, and 5' flanking 
nontranscribed sequences. DNA sequences derived from the 
SV40 splice and polyadenylation sites may be used to provide 
the required nontranscribed genetic elements. 

The breast specific gene polypeptides can be recovered 
and purified from recombinant cell cultures by methods 
including ammonium sulfate or ethanol precipitation, acid 
extraction, anion or cation exchange chromatography, 
phosphocellulose chromatography, hydrophobic interaction 
chromatography, affinity chromatography, hydroxylapatite 
chromatography and lectin chromatography. Protein refolding 
steps can be used, as necessary, in completing configuration 
of the mature protein. Finally, high performance liquid 
chromatography (HPLC) can be employed for final purification 
steps. 

The polynucleotides of the present invention may have 
the coding sequence fused in frame to a marker sequence which 
allows for purification of the polypeptide of the present 
invention. An example of a marker sequence is a 
hexahistidine tag which may be supplied by a vector, 
preferably a pQE-9 vector, which provides for purification of 
the polypeptide fused to the marker in the case of a 
bacterial host, or, for example, the marker sequence may be 
a hemagglutinin (HA) tag when a mammalian host, e.g. COS-7 
cells, is used. The HA tag corresponds to an epitope derived 
from the influenza hemagglutinin protein (Wilson, I., et al., 
Cell, 37:767 (1984)). 

The polypeptides of the present invention may be a 
naturally purified product, or a product of chemical 
synthetic procedures, or produced by recombinant techniques 
from a prokaryotic or eukaryotic host (for example, by 
bacterial, yeast, higher plant, insect and mammalian cells in 
culture). Depending upon the host employed in a recombinant 
production procedure, the polypeptides of the present 
invention may be glycosylated or may be non-glycosylated. 
Polypeptides of the invention may also include an initial 
methionine amino acid residue. 

BSG1, and other breast specific genes, and the protein 
product thereof may be employed for early detection of breast 
cancer since they are over-expressed in the breast cancer 
state. 

In accordance with another aspect of the present 
invention there are provided assays which may be used to 
screen for therapeutics to inhibit the action of the breast 
specific genes or breast specific proteins of the present 
invention. The present invention discloses methods for 
selecting a therapeutic which forms a complex with breast 
specific gene proteins with sufficient affinity to prevent 
their biological action. The methods include various assays, 
including competitive assays where the proteins are 
immobilized to a support, and are contacted with a natural 
substrate and a labeled therapeutic either simultaneously or 
in either consecutive order, and determining whether the 
therapeutic effectively competes with the natural substrate 
in a manner sufficient to prevent binding of the protein to 
its substrate. 

In another embodiment, the substrate is immobilized to 
a support, and is contacted with both a labeled breast 
specific polypeptide and a therapeutic (or unlabeled proteins 
and a labeled therapeutic), and it is determined whether the 
amount of the breast specific polypeptide bound to the 
substrate is reduced in comparison to the assay without the 
therapeutic added. The breast specific polypeptide may be 
labeled with antibodies. 
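
The comparison between the assay with and without the therapeutic can be expressed as a percent inhibition of binding. The following is a minimal sketch under assumptions: the function name, the signal units and the example readings are hypothetical, and the specification does not state a threshold at which a candidate would be considered effective.

```python
def percent_inhibition(bound_with_candidate, bound_without_candidate):
    """Percent reduction in bound labeled polypeptide caused by a candidate.

    Both arguments are label signals in the same arbitrary units, measured
    with and without the candidate therapeutic added to the assay.
    """
    if bound_without_candidate <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * (1.0 - bound_with_candidate / bound_without_candidate)


# Hypothetical readings: 1000 units bound in the control, 250 with candidate.
inhibition = percent_inhibition(250.0, 1000.0)
```

A negative result would indicate that more polypeptide bound in the presence of the candidate than in the control.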

Potential therapeutic compounds include antibodies and 
anti-idiotypic antibodies as described above, or in some 
cases, an oligonucleotide, which binds to the polypeptide. 

Another example is an antisense construct prepared using 
antisense technology, which is directed to a breast specific 
polynucleotide to prevent transcription. Antisense 
technology can be used to control gene expression through 
triple-helix formation or antisense DNA or RNA, both of which 
methods are based on binding of a polynucleotide to DNA or 
RNA. For example, the 5' coding portion of the 
polynucleotide sequence, which encodes for the mature 
polypeptides of the present invention, is used to design an 
antisense RNA oligonucleotide of from about 10 to 40 base 
pairs in length. A DNA oligonucleotide is designed to be 
complementary to a region of the gene involved in 
transcription (triple helix - see Lee et al., Nucl. Acids 
Res., 6:3073 (1979); Cooney et al., Science, 241:456 (1988); 
and Dervan et al., Science, 251:1360 (1991)), thereby 
preventing transcription and the production of a breast 
specific polynucleotide. The antisense RNA oligonucleotide 
hybridizes to the mRNA in vivo and blocks translation of the 
mRNA molecule into the breast specific gene polypeptide 
(antisense - Okano, J. Neurochem., 56:560 (1991); 
Oligodeoxynucleotides as Antisense Inhibitors of Gene 
Expression, CRC Press, Boca Raton, FL (1988)). The 
oligonucleotides described above can also be delivered to 
cells such that the antisense RNA or DNA may be expressed in 
vivo to inhibit production of the breast specific 
polypeptides. 
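
Deriving an antisense RNA oligonucleotide from the 5' coding portion of a sequence amounts to taking a reverse complement. The sketch below illustrates only that step: the function name, the default length and the example sequence are assumptions made here (the sequence is invented and is not one of the disclosed breast specific genes), and practical oligonucleotide design would involve further considerations not covered by this sketch.

```python
# Complement of each DNA or RNA base, written in the RNA alphabet.
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A", "U": "A"}


def antisense_rna(coding_sequence, length=20):
    """Return an antisense RNA oligonucleotide, written 5' to 3'.

    The oligonucleotide is the reverse complement of the first `length`
    bases of the coding sequence, i.e. its 5' coding portion.
    """
    if not 10 <= length <= 40:
        raise ValueError("length should be from about 10 to 40 bases")
    target = coding_sequence.upper()[:length]
    if len(target) < length:
        raise ValueError("coding sequence is shorter than requested length")
    # Reverse the target, then complement each base.
    return "".join(COMPLEMENT[base] for base in reversed(target))


# Invented coding sequence, for illustration only.
example_cds = "ATGGCTAGCAAGGTC"
oligo = antisense_rna(example_cds, length=12)
```

The length check mirrors the range of about 10 to 40 bases given in the text.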

Another example is a small molecule which binds to and 
occupies the active site of the breast specific polypeptide 
thereby making the active site inaccessible to substrate such 
that normal biological activity is prevented. Examples of 
small molecules include but are not limited to small peptides 
or peptide-like molecules. 

These compounds may be employed to treat breast cancer, 
since they interact with the function of breast specific 
polypeptides in a manner sufficient to inhibit natural 
function which is necessary for the viability of breast 
cancer cells. This is true since the BSGs and their protein 
products are primarily expressed in breast cancer tissues and 
are, therefore, suspected of being critical to the formation 
of this state. 

The compounds may be employed in a composition with a 
pharmaceutically acceptable carrier, e.g., as hereinafter 
described. 

The compounds of the present invention may be employed 
in combination with a suitable pharmaceutical carrier. Such 
compositions comprise a therapeutically effective amount of 
the polypeptide, and a pharmaceutically acceptable carrier or 
excipient. Such a carrier includes but is not limited to 
saline, buffered saline, dextrose, water, glycerol, ethanol, 
and combinations thereof. The formulation should suit the 
mode of administration. 

The invention also provides a pharmaceutical pack or kit 
comprising one or more containers filled with one or more of 
the ingredients of the pharmaceutical compositions of the 
invention. Associated with such container(s) can be a notice 
in the form prescribed by a governmental agency regulating 
the manufacture, use or sale of pharmaceuticals or biological 
products, which notice reflects approval by the agency of 
manufacture, use or sale for human administration. In 
addition, the pharmaceutical compositions may be employed in 
conjunction with other therapeutic compounds. 

The pharmaceutical compositions may be administered in 
a convenient manner such as by the oral, topical, 
intravenous, intraperitoneal, intramuscular, subcutaneous, 
intranasal, intra-anal or intradermal routes. The 
pharmaceutical compositions are administered in an amount 
which is effective for treating and/or prophylaxis of the 
specific indication. In general, they are administered in an 
amount of at least about 10 μg/kg body weight and in most 
cases they will be administered in an amount not in excess of 
about 8 mg/kg body weight per day. In most cases, the dosage 
is from about 10 μg/kg to about 1 mg/kg body weight daily, 
taking into account the routes of administration, symptoms, 
etc. 
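
The per-kilogram range quoted above converts to an absolute daily amount by simple arithmetic, sketched below. The function name and the 70 kg example weight are assumptions for illustration; the sketch shows the unit conversion only and is not dosing guidance.

```python
def daily_dose_range_mg(body_weight_kg, low_ug_per_kg=10.0, high_mg_per_kg=1.0):
    """Convert the typical per-kilogram range into milligrams per day.

    Defaults follow the range of about 10 micrograms/kg to about 1 mg/kg
    body weight daily stated in the text.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low_mg = low_ug_per_kg * body_weight_kg / 1000.0  # micrograms to mg
    high_mg = high_mg_per_kg * body_weight_kg
    return low_mg, high_mg


# Hypothetical 70 kg body weight.
low_mg, high_mg = daily_dose_range_mg(70.0)
```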

The breast specific genes and compounds which are 
polypeptides may also be employed in accordance with the 
present invention by expression of such polypeptides in vivo, 
which is often referred to as "gene therapy." 

Thus, for example, cells from a patient may be 
engineered with a polynucleotide (DNA or RNA) encoding a 
polypeptide ex vivo, with the engineered cells then being 
provided to a patient to be treated with the polypeptide. 
Such methods are well-known in the art. For example, cells 
may be engineered by procedures known in the art by use of a 
retroviral particle containing RNA encoding a polypeptide of 
the present invention. 

Similarly, cells may be engineered in vivo for 
expression of a polypeptide in vivo by, for example, 
procedures known in the art. As known in the art, a producer 
cell for producing a retroviral particle containing RNA 
encoding a polypeptide of the present invention may be 
administered to a patient for engineering cells in vivo and 
expression of the polypeptide in vivo. These and other 
methods for administering a polypeptide of the present 
invention by such method should be apparent to those skilled 
in the art from the teachings of the present invention. For 
example, the expression vehicle for engineering cells may be 
other than a retrovirus, for example, an adenovirus which may 
be used to engineer cells in vivo after combination with a 
suitable delivery vehicle. 

Retroviruses from which the retroviral plasmid vectors 
hereinabove mentioned may be derived include, but are not 
limited to, Moloney Murine Leukemia Virus, spleen necrosis 
virus, retroviruses such as Rous Sarcoma virus, Harvey 
Sarcoma Virus, avian leukosis virus, gibbon ape leukemia 
virus, human immunodeficiency virus, adenovirus, 
Myeloproliferative Sarcoma Virus, and mammary tumor virus. 
In one embodiment, the retroviral plasmid vector is derived 
from Moloney Murine Leukemia Virus. 

The vector includes one or more promoters. Suitable 
promoters which may be employed include, but are not limited 
to, the retroviral LTR; the SV40 promoter; and the human 
cytomegalovirus (CMV) promoter described in Miller, et al., 
Biotechniques, Vol. 7, No. 9, 980-990 (1989), or any other 
promoter (e.g., cellular promoters such as eukaryotic 
cellular promoters including, but not limited to, the 
histone, pol III, and β-actin promoters). Other viral 
promoters which may be employed include, but are not limited 
to, adenovirus promoters, thymidine kinase (TK) promoters, 
and B19 parvovirus promoters. The selection of a suitable 
promoter will be apparent to those skilled in the art from 
the teachings contained herein. 

The nucleic acid sequence encoding the polypeptide of 
the present invention is under the control of a suitable 
promoter. Suitable promoters which may be employed include, 
but are not limited to, adenoviral promoters, such as the 
adenoviral major late promoter; or heterologous promoters, 
such as the cytomegalovirus (CMV) promoter; the respiratory 
syncytial virus (RSV) promoter; inducible promoters, such as 
the MMT promoter, the metallothionein promoter; heat shock 
promoters; the albumin promoter; the ApoAI promoter; human 
globin promoters; viral thymidine kinase promoters, such as 
the Herpes Simplex thymidine kinase promoter; retroviral LTRs 
(including the modified retroviral LTRs hereinabove 
described); the β-actin promoter; and human growth hormone 
promoters. The promoter also may be the native promoter 
which controls the genes encoding the polypeptides. 

The retroviral plasmid vector is employed to transduce 
packaging cell lines to form producer cell lines. Examples 
of packaging cells which may be transfected include, but are 
not limited to, the PE501, PA317, ψ-2, ψ-AM, PA12, T19-14X, 
VT-19-17-H2, ψCRE, ψCRIP, GP+E-86, GP+envAm12, and DAN cell 
lines as described in Miller, Human Gene Therapy, Vol. 1, 
pgs. 5-14 (1990), which is incorporated herein by reference 
in its entirety. The vector may transduce the packaging 
cells through any means known in the art. Such means 
include, but are not limited to, electroporation, the use of 
liposomes, and CaPO4 precipitation. In one alternative, the 
retroviral plasmid vector may be encapsulated into a 
liposome, or coupled to a lipid, and then administered to a 
host. 

The producer cell line generates infectious retroviral 
vector particles which include the nucleic acid sequence(s) 
encoding the polypeptides. Such retroviral vector particles 
then may be employed to transduce eukaryotic cells, either 
in vitro or in vivo. The transduced eukaryotic cells will 
express the nucleic acid sequence(s) encoding the 
polypeptide. Eukaryotic cells which may be transduced 
include, but are not limited to, embryonic stem cells, 
embryonic carcinoma cells, as well as hematopoietic stem 
cells, hepatocytes, fibroblasts, myoblasts, keratinocytes, 
endothelial cells, and bronchial epithelial cells. 

This invention is also related to the use of the breast 
specific genes of the present invention as a diagnostic. For 
example, some diseases result from inherited defective genes. 
The breast specific genes, CSG7 and CSG10, for example, have 
been found to have a reduced expression in breast cancer 
cells as compared to that in normal cells. Further, the 
remaining breast specific genes of the present invention are 
overexpressed in breast cancer. Accordingly, a mutation in 
these genes allows a detection of breast disorders, for 
example, breast cancer. A mutation in a breast specific gene 
of the present invention at the DNA level may be detected by 
a variety of techniques. Nucleic acids used for diagnosis 
(genomic DNA, mRNA, etc.) may be obtained from a patient's 
cells, other than from the breast, such as from blood, urine, 
saliva, tissue biopsy and autopsy material. The genomic DNA 
may be used directly for detection or may be amplified 
enzymatically by using PCR (Saiki, et al., Nature, 324:163- 
166 (1986)) prior to analysis. RNA or cDNA may also be used 
for the same purpose. As an example, PCR primers 
complementary to the nucleic acid of the instant invention 
can be used to identify and analyze mutations in a breast 
specific polynucleotide of the present invention. For 
example, deletions and insertions can be detected by a change 
in size of the amplified product in comparison to the normal 
genotype. Point mutations can be identified by hybridizing 
amplified DNA to radiolabelled breast specific RNA or, 
alternatively, radiolabelled antisense DNA sequences. 
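
The size comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name, the tolerance parameter and the example fragment lengths are assumptions introduced here and are not part of the original disclosure. Point mutations do not change the product length and therefore fall into the "no size change" case, which is why hybridization is used for them.

```python
def classify_size_change(normal_bp: int, observed_bp: int, tolerance_bp: int = 0) -> str:
    """Classify an amplified product by its length relative to the normal genotype.

    tolerance_bp is a hypothetical allowance for gel sizing error.
    """
    diff = observed_bp - normal_bp
    if abs(diff) <= tolerance_bp:
        # Same apparent size: no insertion or deletion detectable by size alone.
        return "no size change"
    return "insertion" if diff > 0 else "deletion"


# Hypothetical product lengths, in base pairs.
print(classify_size_change(300, 300))
print(classify_size_change(300, 312))
print(classify_size_change(300, 285))
```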

Another well-established method for screening for 
mutations in particular segments of DNA after PCR 
amplification is single-strand conformation polymorphism 
(SSCP) analysis. PCR products are prepared for SSCP by ten 
cycles of reamplification to incorporate 32P-dCTP, digested 
with an appropriate restriction enzyme to generate 200-300 bp 
fragments, and denatured by heating to 85°C for 5 min. and 
then plunged into ice. Electrophoresis is then carried out 
in a nondenaturing gel (5% glycerol, 5% acrylamide) (Glavac, 
D. and Dean, M., Human Mutation, 2:404-414 (1993)). 

Sequence differences between the reference gene and 
"mutants" may be revealed by the direct DNA sequencing 
method. In addition, cloned DNA segments may be used as 
probes to detect specific DNA segments. The sensitivity of 
this method is greatly enhanced when combined with PCR. For 
example, a sequencing primer is used with double-stranded PCR 
product or a single-stranded template molecule generated by 
a modified PCR. The sequence determination is performed by 
conventional procedures with radiolabeled nucleotides or by 
automatic sequencing procedures with fluorescent tags. 

Genetic testing based on DNA sequence differences may be 
achieved by detection of alteration in electrophoretic 
mobility of DNA fragments in gels with or without denaturing 
agents. Small sequence deletions and insertions can be 
visualized by high-resolution gel electrophoresis. DNA 
fragments of different sequences may be distinguished on 
denaturing formamide gradient gels in which the mobilities of 
different DNA fragments are retarded in the gel at different 
positions according to their specific melting or partial 
melting temperatures (see, e.g., Myers, et al., Science, 
230:1242 (1985)). In addition, sequence alterations, in 
particular small deletions, may be detected as changes in the 
migration pattern of DNA. 

Sequence changes at specific locations may also be 
revealed by nuclease protection assays, such as RNase and S1 
protection or the chemical cleavage method (e.g., Cotton, et 
al., PNAS, USA, 85:4397-4401 (1985)). 

Thus, the detection of the specific DNA sequence may be 
achieved by methods such as hybridization, RNase protection, 
chemical cleavage, direct DNA sequencing, or the use of 
restriction enzymes (e.g., Restriction Fragment Length 
Polymorphisms (RFLP)) and Southern blotting. 

The sequences of the present invention are also valuable 
for chromosome identification. The sequence is specifically 
targeted to and can hybridize with a particular location on 
an individual human chromosome. Moreover, there is a current 
need for identifying particular sites on the chromosome. Few 
chromosome marking reagents based on actual sequence data 
(repeat polymorphisms) are presently available for marking 
chromosomal location. The mapping of DNAs to chromosomes 
according to the present invention is an important first step 
in correlating those sequences with genes associated with 
disease. 

Briefly, sequences can be mapped to chromosomes by 
preparing PCR primers (preferably 15-25 bp) from the cDNA. 
Computer analysis of the 3' untranslated region is used to 
rapidly select primers that do not span more than one exon in 
the genomic DNA, thus complicating the amplification process. 
These primers are then used for PCR screening of somatic cell 
hybrids containing individual human chromosomes. Only those 
hybrids containing the human gene corresponding to the primer 
will yield an amplified fragment. 

PCR mapping of somatic cell hybrids is a rapid procedure 
for assigning a particular DNA to a particular chromosome. 
Using the present invention with the same oligonucleotide 
primers, sublocalization can be achieved with panels of 
fragments from specific chromosomes or pools of large genomic 
clones in an analogous manner. Other mapping strategies that 
can similarly be used to map to its chromosome include in 
situ hybridization, prescreening with labeled flow-sorted 
chromosomes and preselection by hybridization to construct 
chromosome-specific cDNA libraries. 

Fluorescence in situ hybridization (FISH) of a cDNA 
clone to a metaphase chromosomal spread can be used to 
provide a precise chromosomal location in one step. This 
technique can be used with cDNA as short as 50 or 60 bases. 
For a review of this technique, see Verma et al., Human 
Chromosomes: a Manual of Basic Techniques, Pergamon Press, 
New York (1988). 

Once a sequence has been mapped to a precise chromosomal 
location, the physical position of the sequence on the 
chromosome can be correlated with genetic map data. Such 
data are found, for example, in V. McKusick, Mendelian 
Inheritance in Man (available on line through Johns Hopkins 
University Welch Medical Library). The relationship between 
genes and diseases that have been mapped to the same 
chromosomal region is then identified through linkage 
analysis (coinheritance of physically adjacent genes). 

Next, it is necessary to determine the differences in 
the cDNA or genomic sequence between affected and unaffected 
individuals. If a mutation is observed in some or all of the 
affected individuals but not in any normal individuals, then 
the mutation is likely to be the causative agent of the 
disease. 

With current resolution of physical mapping and genetic 
mapping techniques, a cDNA precisely localized to a 
chromosomal region associated with the disease could be one 
of between 50 and 500 potential causative genes. (This 
assumes 1 megabase mapping resolution and one gene per 20 
kb). 
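
The arithmetic behind this estimate can be sketched as follows. The function name is hypothetical, and the 10 megabase interval used for the upper figure is an assumption made here for illustration; the text itself states only the 1 megabase resolution and the density of one gene per 20 kb.

```python
def candidate_genes(region_bp: int, bp_per_gene: int = 20_000) -> int:
    """Number of genes expected in a mapped interval at a uniform gene density."""
    return region_bp // bp_per_gene


# 1 megabase mapping resolution, one gene per 20 kb (figures from the text).
print(candidate_genes(1_000_000))
# A coarser, hypothetical 10 megabase interval at the same density.
print(candidate_genes(10_000_000))
```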

The polypeptides, their fragments or other derivatives, 
or analogs thereof, or cells expressing them can be used as 
an immunogen to produce antibodies thereto. These antibodies 
can be, for example, polyclonal or monoclonal antibodies. 
The present invention also includes chimeric, single chain, 
and humanized antibodies, as well as Fab fragments, or the 
product of an Fab expression library. Various procedures 
known in the art may be used for the production of such 
antibodies and fragments. 

Antibodies generated against the polypeptides 
corresponding to a sequence of the present invention can be 
obtained by direct injection of the polypeptides into an 
animal or by administering the polypeptides to an animal, 
preferably a nonhuman. The antibody so obtained will then 
bind the polypeptides themselves. In this manner, even a 
sequence encoding only a fragment of the polypeptides can be 
used to generate antibodies binding the whole native 
polypeptides. Such antibodies can then be used to isolate 
the polypeptide from tissue expressing that polypeptide. 

For preparation of monoclonal antibodies, any technique 
which provides antibodies produced by continuous cell line 
cultures can be used. Examples include the hybridoma 
technique (Kohler and Milstein, 1975, Nature, 256:495-497), 
the trioma technique, the human B-cell hybridoma technique 
(Kozbor et al., 1983, Immunology Today 4:72), and the EBV- 
hybridoma technique to produce human monoclonal antibodies 
(Cole, et al., 1985, in Monoclonal Antibodies and Cancer 
Therapy, Alan R. Liss, Inc., pp. 77-96). 

Techniques described for the production of single chain 
antibodies (U.S. Patent 4,946,778) can be adapted to produce 
single chain antibodies to immunogenic polypeptide products 
of this invention. Transgenic mice may also be used to 
generate antibodies. 

The antibodies may also be employed to target breast 
cancer cells, for example, in a method of homing interaction 
agents which, when contacting breast cancer cells, destroy 
them. This is true since the antibodies are specific for the 
breast specific polypeptides of the present invention. A 
linking of the interaction agent to the antibody would cause 
the interaction agent to be carried directly to the breast. 

Antibodies of this type may also be used to do in vivo 
imaging, for example, by labeling the antibodies to 
facilitate scanning of the pelvic area and the breast. One 
method for imaging comprises contacting any cancer cells of 
the breast to be imaged with an anti-breast specific protein- 
antibody labeled with a detectable marker. The method is 
performed under conditions such that the labeled antibody 
binds to the breast specific polypeptides. In a specific 
example, the antibodies interact with the breast, for 
example, breast cancer cells, and fluoresce upon contact such 
that imaging and visibility of the breast are enhanced to 
allow a determination of the diseased or non-diseased state 
of the breast. 

The present invention will be further described with 
reference to the following examples; however, it is to be 
understood that the present invention is not limited to such 
examples. All parts or amounts, unless otherwise specified, 
are by weight. 

In order to facilitate understanding of the following 
examples certain frequently occurring methods and/or terms 
will be described. 

"Plasmids" are designated by a lower case p preceded 
and/or followed by capital letters and/or numbers. The 
starting plasmids herein are either commercially available, 
publicly available on an unrestricted basis, or can be 
constructed from available plasmids in accord with published 
procedures. In addition, equivalent plasmids to those 
described are known in the art and will be apparent to the 
ordinarily skilled artisan. 

"Digestion" of DNA refers to catalytic cleavage of the 
DNA with a restriction enzyme that acts only at certain 
sequences in the DNA. The various restriction enzymes used 
herein are commercially available and their reaction 
conditions, cofactors and other requirements were used as 
would be known to the ordinarily skilled artisan. For 
analytical purposes, typically 1 µg of plasmid or DNA 
fragment is used with about 2 units of enzyme in about 20 µl 
of buffer solution. For the purpose of isolating DNA 
fragments for plasmid construction, typically 5 to 50 µg of 
DNA are digested with 20 to 250 units of enzyme in a larger 
volume. Appropriate buffers and substrate amounts for 
particular restriction enzymes are specified by the 
manufacturer. Incubation times of about 1 hour at 37°C are 
ordinarily used, but may vary in accordance with the 
supplier's instructions. After digestion the reaction is 
electrophoresed directly on a polyacrylamide gel to isolate 
the desired fragment. 

Size separation of the cleaved fragments is performed 
using 1 percent TAE agarose gel described by Sambrook, et 
al., "Molecular Cloning: A Laboratory Manual" Cold Spring 
Laboratory Press, (1989). 

"Oligonucleotides" refers to either a single stranded 
polydeoxynucleotide or two complementary polydeoxynucleotide 
strands which may be chemically synthesized. Such synthetic 
oligonucleotides have no 5' phosphate and thus will not 
ligate to another oligonucleotide without adding a phosphate 
with an ATP in the presence of a kinase. A synthetic 
oligonucleotide will ligate to a fragment that has not been 
dephosphorylated. 

"Ligation" refers to the process of forming 
phosphodiester bonds between two double stranded nucleic acid 
fragments (Maniatis, T., et al., Id., p. 146). Unless 
otherwise provided, ligation may be accomplished using known 
buffers and conditions with 10 units of T4 DNA ligase 
("ligase") per 0.5 µg of approximately equimolar amounts of 
the DNA fragments to be ligated. 

Unless otherwise stated, transformation was performed as 
described in the method of Graham, F. and Van der Eb, A., 
Virology, 52:456-457 (1973). 

Example 1 

Determination of Transcription of a breast specific gene 

To assess the presence or absence of active 
transcription of a breast specific gene RNA, approximately 6 
ml of venous blood is obtained with a standard venipuncture 
technique using heparinized tubes. Whole blood is mixed with 
an equal volume of phosphate buffered saline, which is then 
layered over 8 ml of Ficoll (Pharmacia, Uppsala, Sweden) in 
a 15-ml polystyrene tube. The gradient is centrifuged at 
1800 x g for 20 min at 5°C. The lymphocyte and granulocyte 
layer (approximately 5 ml) is carefully aspirated and 
rediluted up to 50 ml with phosphate-buffered saline in a 50- 
ml tube, which is centrifuged again at 1800 x g for 20 min. 
at 5°C. The supernatant is discarded and the pellet 
containing nucleated cells is used for RNA extraction using 
the RNAzol B method as described by the manufacturer (Tel- 
Test Inc., Friendswood, TX). 

To determine the quantity of mRNA from the gene of 
interest, a probe is designed with an identity to at least a 
portion of the mRNA sequence transcribed from a human gene 
whose coding portion includes a DNA sequence of Figures 1-20 
(SEQ ID NO:1-20). This probe is mixed with the extracted RNA 
and the mixed DNA and RNA are precipitated with ethanol (-70°C 
for 15 minutes). The pellet is resuspended in hybridization 
buffer and dissolved. The tubes containing the mixture are 
incubated in a 72°C water bath for 10-15 mins. to denature 
the DNA. The tubes are rapidly transferred to a water bath 
at the desired hybridization temperature. Hybridization 
temperature depends on the G + C content of the DNA. 
Hybridization is done for 3 hrs. 0.3 ml of nuclease-S1 
buffer is added and mixed well. 50 µl of 4.0 M ammonium 
acetate and 0.1 M EDTA is added to stop the reaction. The 
mixture is extracted with phenol/chloroform and 20 µg of 
carrier tRNA is added and precipitation is done with an equal 
volume of isopropanol. The precipitate is dissolved in 40 µl 
of TE (pH 7.4) and run on an alkaline agarose gel. Following 
electrophoresis, the RNA is microsequenced to confirm the 
nucleotide sequence. (See Favaloro, J. et al., Methods 
Enzymol., 65:718 (1980) for a more detailed review). 
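
Because the hybridization temperature is chosen from the G + C content of the probe, that content has to be computed first. The following is a minimal sketch of such a calculation; the function name and the example sequence are hypothetical, and no particular formula for converting G + C content into a temperature is implied, since none is given in the text.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    # Count bases that are G or C; A, T and U do not contribute.
    return sum(base in "GC" for base in seq) / len(seq)


# Hypothetical probe fragment, for illustration only.
print(gc_fraction("ATGGATCCGA"))
```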

Two oligonucleotide primers are employed to amplify the 
sequence isolated by the above methods. The 5' primer is 20 
nucleotides long and the 3' primer is a complementary 
sequence for the 3' end of the isolated mRNA. The primers 
are custom designed according to the isolated mRNA. The 
reverse transcriptase reaction and PCR amplification are 
performed sequentially without interruption in a Perkin Elmer 
9600 PCR machine (Emeryville, CA). Four hundred ng total RNA 
in 20 µl diethylpyrocarbonate-treated water are placed in a 
65°C water bath for 5 min. and then quickly chilled on ice 
immediately prior to the addition of PCR reagents. The 50-µl 
total PCR volume consisted of 2.5 units Taq polymerase 
(Perkin-Elmer); 2 units avian myeloblastosis virus reverse 
transcriptase (Boehringer Mannheim, Indianapolis, IN); 200 µM 
each of dCTP, dATP, dGTP and dTTP (Perkin Elmer); 18 pM each 
primer; 10 mM Tris-HCl; 50 mM KCl; and 2 mM MgCl2 (Perkin 
Elmer). PCR conditions are as follows: cycle 1 is 42°C for 
15 min then 97°C for 15 s (1 cycle); cycle 2 is 95°C for 1 
min, 60°C for 1 min, and 72°C for 30 s (15 cycles); cycle 3 
is 95°C for 1 min, 60°C for 1 min., and 72°C for 1 min. (10 
cycles); cycle 4 is 95°C for 1 min., 60°C for 1 min., and 
72°C for 2 min. (8 cycles); cycle 5 is 72°C for 15 min. (1 
cycle); and the final cycle is a 4°C hold until sample is 
taken out of the machine. The 50-µl PCR products are 
concentrated down to 10 µl with vacuum centrifugation, and a 
sample is then run on a thin 1.2% Tris-borate-EDTA agarose 
gel containing ethidium bromide. A band of expected size 
would indicate that this gene is present in the tissue 
assayed. The amount of RNA in the pellet may be quantified 
in numerous ways, for example, it may be weighed. 
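
The cycling program listed above can be written out as data, which makes it easy to check the number of amplification cycles and the programmed hold time. This is only a sketch: the class and function names are introduced here for illustration, ramp times between temperatures are ignored, and the open-ended 4°C hold is left out because it has no fixed duration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int
    seconds: int


@dataclass(frozen=True)
class Stage:
    steps: tuple
    repeats: int


# Stages "cycle 1" to "cycle 5" as given in the text.
PROGRAM = (
    Stage((Step(42, 15 * 60), Step(97, 15)), 1),             # reverse transcription
    Stage((Step(95, 60), Step(60, 60), Step(72, 30)), 15),
    Stage((Step(95, 60), Step(60, 60), Step(72, 60)), 10),
    Stage((Step(95, 60), Step(60, 60), Step(72, 120)), 8),
    Stage((Step(72, 15 * 60),), 1),                          # final extension
)


def programmed_seconds(program) -> int:
    """Total hold time of all steps, excluding ramping."""
    return sum(sum(s.seconds for s in st.steps) * st.repeats for st in program)


def amplification_cycles(program) -> int:
    """Count repeats of stages that contain a 95 degree denaturation step."""
    return sum(st.repeats for st in program
               if any(s.temp_c == 95 for s in st.steps))


print(amplification_cycles(PROGRAM), programmed_seconds(PROGRAM) / 60)
```

Under these assumptions the program comprises 33 amplification cycles and a little under 130 minutes of programmed hold time.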

Verification of the nucleotide sequence of the PCR 
products is done by microsequencing. The PCR product is 
purified with a Qiagen PCR Product Purification Kit (Qiagen, 
Chatsworth, CA) as described by the manufacturer. One µg of 
the PCR product undergoes PCR sequencing by using the Taq 
DyeDeoxy Terminator Cycle sequencing kit in a Perkin-Elmer 
9600 PCR machine as described by Applied Biosystems (Foster 
City, CA). The sequenced product is purified using Centri-Sep 
columns (Princeton Separations, Adelphia, NJ) as described by 
the company. This product is then analyzed with an ABI model 
373A DNA sequencing system (Applied Biosystems) integrated 
with a Macintosh IIci computer. 

Example 2 

Bacterial Expression and Purification of the BSG Proteins and 
Use For Preparing a Monoclonal Antibody 

The DNA sequence encoding a polypeptide of the present 
invention, for this example BSG1, ATCC # 97175, is initially 
amplified using PCR oligonucleotide primers corresponding to 
the 5' sequences of the protein and the vector sequences 3' 
to the protein. Additional nucleotides corresponding to the 
DNA sequence are added to the 5' and 3' sequences 
respectively. The 5' oligonucleotide primer has the sequence 
5' GCCACCATGGATGTTTTGAAG 3' (SEQ ID NO:21) and contains an 
NcoI restriction enzyme site followed by 15 nucleotides of 
coding sequence starting from the initial amino acid of the 
processed protein. The 3' sequence 5' 
GGGCAGATCTGTCTCCCCCACTCTGGGC 3' (SEQ ID NO:22) contains a 
complementary sequence to a BglII restriction enzyme site and 
is followed by 18 nucleotides of the nucleic acid sequence 
encoding the protein. The restriction enzyme sites correspond 
to the restriction enzyme sites on a bacterial expression 
vector, pQE-60 (Qiagen, Inc., Chatsworth, CA). pQE-60 encodes 
antibiotic resistance (Ampr), a bacterial origin of 
replication (ori), an IPTG-regulatable promoter operator 
(P/O), a ribosome binding site (RBS), a 6-His tag and 
restriction enzyme sites. pQE-60 is then digested with NcoI 
and BglII. The amplified sequences are ligated into pQE-60 
and inserted in frame with the sequence encoding for the 
histidine tag and the RBS. The ligation mixture is then used 
to transform an E. coli strain M15/rep4 (Qiagen) by the 
procedure described in Sambrook, J. et al., Molecular 
Cloning: A Laboratory Manual, Cold Spring Laboratory Press, 
(1989). M15/rep4 contains multiple copies of the plasmid 
pREP4, which expresses the lacI repressor and also confers 
kanamycin resistance (Kanr). Transformants are identified by 
their ability to grow on LB plates and ampicillin/kanamycin 
resistant colonies are selected. Plasmid DNA is isolated and 
confirmed by restriction analysis. 
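
As a quick sanity check on the primer design described above, the restriction sites can be located in the primer sequences programmatically. The sketch below is illustrative only: the primer strings are a reading of SEQ ID NO:21 and SEQ ID NO:22 as printed in this example, the recognition sequences CCATGG (NcoI) and AGATCT (BglII) are the standard ones for those enzymes, and the helper names are hypothetical.

```python
NCOI_SITE = "CCATGG"
BGLII_SITE = "AGATCT"

# Primer sequences as read from the text (SEQ ID NO:21 and SEQ ID NO:22).
FIVE_PRIME = "GCCACCATGGATGTTTTGAAG"
THREE_PRIME = "GGGCAGATCTGTCTCCCCCACTCTGGGC"


def site_position(primer: str, site: str) -> int:
    """Zero-based index of the first occurrence of the site, or -1 if absent."""
    return primer.upper().find(site)


def bases_after_site(primer: str, site: str) -> int:
    """Number of bases that follow the recognition site in the primer."""
    pos = site_position(primer, site)
    if pos < 0:
        raise ValueError(f"site {site} not found in primer")
    return len(primer) - (pos + len(site))


# The ATG start codon sits inside the NcoI site (CC-ATG-G), two bases in.
atg_index = site_position(FIVE_PRIME, NCOI_SITE) + 2
print(len(FIVE_PRIME) - atg_index)                # bases from the ATG onward
print(bases_after_site(THREE_PRIME, BGLII_SITE))  # bases after the BglII site
```

With these readings, the 5' primer carries 15 bases counted from the ATG within the NcoI site, and the 3' primer carries 18 bases after the BglII site, which agrees with the counts stated in the example.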

Clones containing the desired constructs are grown 
overnight (O/N) in liquid culture in LB media supplemented 
with both Amp (100 µg/ml) and Kan (25 µg/ml). The O/N 
culture is used to inoculate a large culture at a ratio of 
1:100 to 1:250. The cells are grown to an optical density 
600 (O.D.600) of between 0.4 and 0.6. IPTG ("Isopropyl-β-D- 
thiogalactopyranoside") is then added to a final 
concentration of 1 mM. IPTG induces by inactivating the lacI 
repressor, clearing the P/O leading to increased gene 
expression. Cells are grown an extra 3 to 4 hours. Cells 
are then harvested by centrifugation. The cell pellet is 
solubilized in the chaotropic agent 6 Molar Guanidine HCl. 
After clarification, solubilized protein is purified from 
this solution by chromatography on a Nickel-Chelate column 
under conditions that allow for tight binding by proteins 
containing the 6-His tag (Hochuli, E. et al., J. 
Chromatography 411:177-184 (1984)). BSG1 protein (>90% pure) 
is eluted from the column in 6 molar guanidine HCl pH 5.0 and 
for the purpose of renaturation adjusted to 3 molar guanidine 
HCl, 100 mM sodium phosphate, 10 mmolar glutathione (reduced) 
and 2 mmolar glutathione (oxidized). After incubation in 
this solution for 12 hours the protein is dialyzed to 10 
mmolar sodium phosphate. 
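
The inoculation ratio and the IPTG addition described above reduce to simple dilution arithmetic. The sketch below assumes a 1 litre culture and a 1 M IPTG stock purely for illustration; neither figure appears in the text. It also treats the ratio as inoculum volume to culture volume and ignores the small volume change caused by the additions.

```python
def inoculum_volume_ml(culture_ml: float, ratio: float) -> float:
    """Overnight culture needed to inoculate at 1:ratio (e.g. ratio=100)."""
    return culture_ml / ratio


def stock_volume_ml(final_mM: float, culture_ml: float, stock_mM: float) -> float:
    """Stock volume from C1*V1 = C2*V2, ignoring the added volume."""
    return final_mM * culture_ml / stock_mM


culture_ml = 1000.0  # hypothetical culture volume
print(inoculum_volume_ml(culture_ml, 100))   # 1:100 inoculation
print(inoculum_volume_ml(culture_ml, 250))   # 1:250 inoculation
# IPTG to 1 mM final from a hypothetical 1 M (1000 mM) stock.
print(stock_volume_ml(1.0, culture_ml, 1000.0))
```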

The protein purified in this manner may be used as an 
epitope to raise monoclonal antibodies specific to such 
protein. The monoclonal antibodies generated against the 
isolated protein can be obtained by direct 
injection of the polypeptides into an animal or by 
administering the polypeptides to an animal. The antibodies 
so obtained will then bind to the protein itself. Such 
antibodies can then be used to isolate the protein from 
tissue expressing that polypeptide by the use of, for 
example, an ELISA assay. 

Example 3 

Preparation of cDNA Libraries from Breast Tissue 

Total cellular RNA is prepared from tissues by the 
guanidinium-phenol method as previously described (P. 
Chomczynski and N. Sacchi, Anal. Biochem., 162:156-159 
(1987)) using RNAzol (Cinna-Biotecx). An additional ethanol 
precipitation of the RNA is included. Poly A mRNA is 
isolated from the total RNA using oligo dT-coated latex beads 
(Qiagen). Two rounds of poly A selection are performed to 
ensure better separation from non-polyadenylated material 
when sufficient quantities of total RNA are available. 

The mRNA selected on the oligo dT is used for the 
synthesis of cDNA by a modification of the method of Gubler 
and Hoffman (Gubler, U. and B.J. Hoffman, 1983, Gene, 
25:263). The first strand synthesis is performed using 
either Moloney murine sarcoma virus reverse transcriptase 
(Stratagene) or Superscript II (RNase H minus Moloney murine 
reverse transcriptase, Gibco-BRL). First strand synthesis is 
primed using a primer/linker containing an Xho I restriction 
site. The nucleotide mix used in the synthesis contains 
methylated dCTP to prevent restriction within the cDNA 
sequence. For second-strand synthesis E. coli polymerase 
Klenow fragment is used and [32P]-dATP is incorporated as a 
tracer of nucleotide incorporation. 

Following 2nd strand synthesis, the cDNA is made blunt 
ended using either T4 DNA polymerase or Klenow fragment. Eco 
RI adapters are added to the cDNA and the cDNA is restricted 
with Xho I. The cDNA is size fractionated over a Sephacryl 
S-500 column (Pharmacia) to remove excess linkers and cDNAs 
under approximately 500 base pairs. 

The cDNA is cloned unidirectionally into the Eco RI-Xho 
I sites of either pBluescript II phagemid or lambda Uni-Zap 
XR (Stratagene). In the case of cloning into pBluescript II, 
the plasmids are electroporated into E. coli SURE competent 
cells (Stratagene). When the cDNA is cloned into Uni-Zap XR 
it is packaged using the Gigapack II packaging extract 
(Stratagene). The packaged phage is used to infect SURE 
cells and amplified. The pBluescript phagemids containing the 
cDNA inserts are excised from the lambda Zap phage using the 
helper phage ExAssist (Stratagene). The rescued phagemid is 
plated on SOLR E. coli cells (Stratagene). 

Preparation of Sequencing Templates 

Template DNA for sequencing is prepared by 1) a boiling 
method or 2) PCR amplification. 

The boiling method is a modification of the method of 
Holmes and Quigley (Holmes, D.S. and M. Quigley, 1981, Anal. 
Biochem., 114:193). Colonies from either cDNA cloned into 
Bluescript II or rescued Bluescript phagemid are grown in an 
enriched bacterial media overnight. 400 µl of cells are 
centrifuged and resuspended in STET (0.1 M NaCl, 10 mM Tris pH 
8.0, 1.0 mM EDTA and 5% Triton X-100) including lysozyme (80 
µg/ml) and RNase A (4 µg/ml). Cells are boiled for 40 
seconds and centrifuged for 10 minutes. The supernatant is 
removed and the DNA is precipitated with PEG/NaCl and washed 
with 70% ethanol (2x). Templates are resuspended in water at 
approximately 250 ng/µl. 

Preparation of templates by PCR is a modification of the 
method of Rosenthal et al. (Rosenthal, et al., Nucleic Acids 
Res., 1993, 21:173-174). Colonies containing cDNA cloned 
into pBluescript II or rescued pBluescript phagemid are grown 
overnight in LB containing ampicillin in a 96 well tissue 
culture plate. Two µl of the cultures are used as template 
in a PCR reaction (Saiki, RK, et al., Science, 239:487-493, 
1988; and Saiki, RK, et al., Science, 230:1350-1354, 1985) 
using a tricine buffer system (Ponce and Micol., Nucleic 
Acids Res., 1992, 20:1992) and 200 µM dNTPs. The primer set 
chosen for amplification of the templates is outside of 
primer sites chosen for sequencing of the templates. The 
primers used are 5'-ATGCTTCCGGCTCGTATG-3' (SEQ ID NO:23) 
which is 5' of the M13 reverse sequence in pBluescript and 
5'-GGGTTTTCCCAGTCACGAC-3' (SEQ ID NO:24) which is 3' of the 
M13 forward primer in pBluescript. Any primers which 
correspond to the sequence flanking the M13 forward and 
reverse sequences can be used. Perkin-Elmer 9600 
thermocyclers are used for amplification of the templates 
with the following cycler conditions: 5 min at 94°C (1 
cycle); (20 sec at 94°C; 20 sec at 55°C; 1 min at 72°C) (30 
cycles); 7 min at 72°C (1 cycle). Following amplification 
the PCR templates are precipitated using PEG/NaCl and washed 
three times with 70% ethanol. The templates are resuspended 
in water. 

Example 4 

Isolation of a Selected Clone From Breast Tissue 

Two approaches are used to isolate a particular clone 
from a cDNA library prepared from human breast tissue. 

In the first, a clone is isolated directly by screening 
the library using an oligonucleotide probe. To isolate a 
particular clone, a specific oligonucleotide with 30-40 
nucleotides is synthesized using an Applied Biosystems DNA 
synthesizer according to one of the partial sequences 
described in this application. The oligonucleotide is 
labeled with 32P-γ-ATP using T4 polynucleotide kinase and 
purified according to the standard protocol (Maniatis et al., 
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor 
Press, Cold Spring, NY, 1982). The Lambda cDNA library is 
plated on 1.5% agar plate to a density of 20,000-50,000 
pfu/150 mm plate. These plates are screened using Nylon 
membranes according to the standard phage screening protocol 
(Stratagene, 1993). Specifically, the Nylon membrane with 
denatured and fixed phage DNA is prehybridized in 6 x SSC, 20 
mM NaH2PO4, 0.4% SDS, 5 x Denhardt's, 500 µg/ml denatured, 
sonicated salmon sperm DNA; and 6 x SSC, 0.1% SDS. After one 
hour of prehybridization, the membrane is hybridized with 
hybridization buffer 6 x SSC, 20 mM NaH2PO4, 0.4% SDS, 500 
µg/ml denatured, sonicated salmon sperm DNA with 1 x 10* 
cpm/ml 32P-probe overnight at 42°C. The membrane is washed at 
45-50°C with washing buffer 6 x SSC, 0.1% SDS for 20-30 
minutes, dried and exposed to Kodak X-ray film overnight. 
Positive clones are isolated and purified by secondary and 
tertiary screening. The purified clone is sequenced to verify 
its identity to the partial sequence described in this 
application. 

An alternative approach to screen the cDNA library 
prepared from human breast tissue is to prepare a DNA probe 
corresponding to the entire partial sequence. To prepare a 
probe, two oligonucleotide primers of 17-20 nucleotides 
derived from both ends of the partial sequence reported are 
synthesized and purified. These two oligonucleotides are 
used to amplify the probe using the cDNA library template. 
The DNA template is prepared from the phage lysate of the 
cDNA library according to the standard phage DNA preparation 
protocol (Maniatis et al.). The polymerase chain reaction is 
carried out in 25 µl reaction mixture with 0.5 µg of the 
above cDNA template. The reaction mixture is 1.5-5 mM MgCl2, 
0.01% (w/v) gelatin, 20 µM each of dATP, dCTP, dGTP, dTTP, 25 
pmol of each primer and 0.25 Unit of Taq polymerase. Thirty 
five cycles of PCR (denaturation at 94°C for 1 min; annealing 
at 55°C for 1 min; elongation at 72°C for 1 min) are 
performed with the Perkin-Elmer Cetus automated thermal 
cycler. The amplified product is analyzed by agarose gel 
electrophoresis and the DNA band with expected molecular 
weight is excised and purified. The PCR product is verified 
to be the probe by subcloning and sequencing the DNA product. 
The probe is labeled with the Multiprime DNA Labelling System 
(Amersham) at a specific activity < 1 x 10' dpm/µg. This 
probe is used to screen the lambda cDNA library according to 
Stratagene's protocol. Hybridization is carried out with 5X 
TEN (20X TEN: 0.3 M Tris-HCl pH 8.0, 0.02 M EDTA and 3 M NaCl), 5X 
Denhardt's, 0.5% sodium pyrophosphate, 0.1% SDS, 0.2 mg/ml 
heat denatured salmon sperm DNA and 1 x 10* cpm/ml of 
[32P]-labeled probe at 55°C for 12 hours. The filters are washed 
in 0.5X TEN at room temperature for 20-30 min., then at 55°C 
for 15 min. The filters are dried and autoradiographed at 
-70°C using Kodak XAR-5 film. The positive clones are 
purified by secondary and tertiary screening. The sequence 
of the isolated clone is verified by DNA sequencing. 
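
The quantities in the PCR and hybridization steps above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the described procedure: the function names are illustrative, and the only inputs are the figures stated in the text (25 pmol of each primer in a 25 µl reaction, 35 cycles of three 1 min steps, and the 20X TEN composition).

```python
# Worked arithmetic for the probe-amplification PCR and the TEN buffers.
# Input numbers come from the text; function names are illustrative only.

def primer_concentration_uM(pmol: float, reaction_volume_ul: float) -> float:
    """pmol per microliter is numerically equal to micromolar."""
    return pmol / reaction_volume_ul


def cycling_minutes(cycles: int, step_minutes: list) -> float:
    """Total programmed hold time, ignoring ramping between temperatures."""
    return cycles * sum(step_minutes)


def dilute(stock_molar: dict, stock_x: float, working_x: float) -> dict:
    """Scale each component of an N-fold stock to the working strength."""
    factor = working_x / stock_x
    return {name: conc * factor for name, conc in stock_molar.items()}


# Molar concentrations of 20X TEN as given in the text.
TEN_20X = {"Tris-HCl": 0.3, "EDTA": 0.02, "NaCl": 3.0}

print("primer (uM):", primer_concentration_uM(25, 25))
print("hold time (min):", cycling_minutes(35, [1, 1, 1]))
print("5X TEN (M):", dilute(TEN_20X, 20, 5))
print("0.5X TEN (M):", dilute(TEN_20X, 20, 0.5))
```

Under these figures each primer is at 1 µM, the programmed hold time is 105 minutes, 5X TEN corresponds to 0.075 M Tris-HCl, 0.005 M EDTA and 0.75 M NaCl, and the 0.5X TEN wash is one tenth of that.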

General procedures for obtaining complete sequences from 
partial sequences described herein are summarized as follows: 

Procedure 1 

Selected human DNA from the partial sequence clone (the 
cDNA clone that was sequenced to give the partial sequence) 
is purified, e.g., by endonuclease digestion using EcoRI, gel 
electrophoresis, and isolation of the clone by removal from 
low melting agarose gel. The isolated insert DNA is 
radiolabeled, e.g., with 32P labels, preferably by nick 
translation or random primer labeling. The labeled insert is 
used as a probe to screen a lambda phage cDNA library or a 
plasmid cDNA library. Colonies containing clones related to 
the probe cDNA are identified and purified by known 
purification methods. The ends of the newly purified clones 
are nucleotide sequenced to identify full length sequences. 
Complete sequencing of full length clones is then performed 
by Exonuclease III digestion or primer walking. Northern 
blots of the mRNA from various tissues using at least part of 
the deposited clone from which the partial sequence is 
obtained as a probe can optionally be performed to check the 
size of the mRNA against that of the purported full length 
cDNA. 

The following procedures 2 and 3 can be used to obtain 
full length genes or full length coding portions of genes 
where a clone isolated from the deposited clone mixture does 
not contain a full length sequence. A library derived from 
human breast tissue or from the deposited clone mixture is 
also applicable to obtaining full length sequences from 
clones obtained from sources other than the deposited mixture 
by use of the partial sequences of the present invention. 

Procedure 2 

RACE Protocol For Recovery of Full-Length Genes 

Partial cDNA clones can be made full-length by utilizing 
the rapid amplification of cDNA ends (RACE) procedure 
described in Frohman, M.A., Dush, M.K. and Martin, G.R. 
(1988) Proc. Nat'l. Acad. Sci. USA, 85:8998-9002. A cDNA 
clone missing either the 5' or 3' end can be reconstructed to 
include the absent base pairs extending to the translational 
start or stop codon, respectively. In most cases, cDNAs are 
missing the start of translation therefor. The following 
briefly describes a modification of this original 5' RACE 
procedure. Poly A+ or total RNA is reverse transcribed with 
Superscript II (Gibco/BRL) and an antisense or complementary 
primer specific to the cDNA sequence. The primer is removed 
from the reaction with a Microcon Concentrator (Amicon). The 
first-strand cDNA is then tailed with dATP and terminal 
deoxynucleotide transferase (Gibco/BRL). Thus, an anchor 
sequence is produced which is needed for PCR amplification. 
The second strand is synthesized from the dA-tail in PCR 
buffer, Taq DNA polymerase (Perkin-Elmer Cetus), an oligo-dT 
primer containing three adjacent restriction sites (XhoI, 
SalI and ClaI) at the 5' end and a primer containing just 
these restriction sites. This double-stranded cDNA is PCR 
amplified for 40 cycles with the same primers as well as a 
nested cDNA-specific antisense primer. The PCR products are 
size-separated on an ethidium bromide-agarose gel and the 
region of gel containing cDNA products the predicted size of 
missing protein-coding DNA is removed. cDNA is purified from 
the agarose with the Magic PCR Prep kit (Promega), 
restriction digested with XhoI or SalI, and ligated to a 
plasmid such as pBluescript SKII (Stratagene) at XhoI and 
EcoRV sites. This DNA is transformed into bacteria and the 
plasmid clones sequenced to identify the correct 
protein-coding inserts. Correct 5' ends are confirmed by comparing 
this sequence with the putatively identified homologue and 
overlap with the partial cDNA clone. 
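
The overlap check in the last step can be illustrated in code. The following is a minimal sketch under stated assumptions, not the procedure actually used: it joins a 5' RACE read to a partial cDNA only when the 3' end of the first matches the 5' end of the second exactly. The function name, the `min_overlap` threshold, and both toy sequences are hypothetical.

```python
from typing import Optional


def merge_by_overlap(upstream: str, downstream: str,
                     min_overlap: int = 12) -> Optional[str]:
    """Join two reads when the 3' end of `upstream` equals the 5' end of
    `downstream`.

    Returns the merged sequence, or None if no exact overlap of at least
    `min_overlap` bases exists. Exact matching only; reads containing
    sequencing errors would need a proper alignment tool instead.
    """
    longest = min(len(upstream), len(downstream))
    # Try the longest possible overlap first, then shorter ones.
    for k in range(longest, min_overlap - 1, -1):
        if upstream.endswith(downstream[:k]):
            return upstream + downstream[k:]
    return None


# Hypothetical toy reads, invented for illustration only.
race_product = "GCCATCATGGATGTTTTCAAGAAGGGCTTC"
partial_cdna = "TTTTCAAGAAGGGCTTCTCCATCGCC"

merged = merge_by_overlap(race_product, partial_cdna)
print(merged)
print(merge_by_overlap(race_product, "ACGTACGTACGTACGT"))
```

A result of None would indicate that the amplified product does not extend the partial clone and should not be treated as its 5' end.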

Several quality-controlled kits are available for 
purchase. Similar reagents and methods to those above are 
supplied in kit form from Gibco/BRL. A second kit is 
available from Clontech which is a modification of a related 
technique, SLIC (single-stranded ligation to single-stranded 
cDNA) developed by Dumas et al. (Dumas, J.B., Edwards, M., 
Delort, J. and Mallet, Jr., 1991, Nucleic Acids Res., 
19:5227-5232). The major differences in procedure are that 
the RNA is alkaline hydrolyzed after reverse transcription 
and RNA ligase is used to join a restriction site-containing 
anchor primer to the first-strand cDNA. This obviates the 
necessity for the dA-tailing reaction which results in a 
polyT stretch that is difficult to sequence past. 

An alternative to generating 5' cDNA from RNA is to use 
cDNA library double-stranded DNA. An asymmetric PCR-amplified 
antisense cDNA strand is synthesized with an 
antisense cDNA-specific primer and a plasmid-anchored primer. 
These primers are removed and a symmetric PCR reaction is 
performed with a nested cDNA-specific antisense primer and 
the plasmid-anchored primer. 

Procedure 3 

RNA Ligase Protocol For Generating The 5' End Sequences To 
Obtain Full Length Genes 

Once a gene of interest is identified, several methods 
are available for the identification of the 5' or 3' portions 
of the gene which may not be present in the original 
deposited clone. These methods include but are not limited 
to filter probing, clone enrichment using specific probes and 
protocols similar and identical to 5' and 3' RACE. While the 
full length gene may be present in a library and can be 
identified by probing, a useful method for generating the 5' 
end is to use the existing sequence information from the 
original partial sequence to generate the missing 
information. A method similar to 5' RACE is available for 
generating the missing 5' end of a desired full-length gene. 
(This method was published by Fromont-Racine et al., Nucleic 
Acids Res., 21(7):1683-1684 (1993).) Briefly, a specific RNA 
oligonucleotide is ligated to the 5' ends of a population of 
RNA presumably containing full-length gene RNA transcript, and 
a primer set containing a primer specific to the ligated RNA 
oligonucleotide and a primer specific to a known sequence (EST) 
of the gene of interest is used to PCR amplify the 5' portion 
of the desired full length gene which may then be sequenced 
and used to generate the full length gene. This method 
starts with total RNA isolated from the desired source; poly 
A RNA may be used but is not a prerequisite for this 
procedure. The RNA preparation may then be treated with 
phosphatase if necessary to eliminate 5' phosphate groups on 
degraded or damaged RNA which may interfere with the later 
RNA ligase step. The phosphatase if used is then inactivated 
and the RNA is treated with tobacco acid pyrophosphatase in 
order to remove the cap structure present at the 5' ends of 
messenger RNAs. This reaction leaves a 5' phosphate group at 
the 5' end of the cap-cleaved RNA which can then be ligated 
to an RNA oligonucleotide using T4 RNA ligase. This modified 
RNA preparation can then be used as a template for first 
strand cDNA synthesis using a gene-specific oligonucleotide. 
The first strand synthesis reaction can then be used as a 
template for PCR amplification of the desired 5' end using a 
primer specific to the ligated RNA oligonucleotide and a 
primer specific to the known sequence (EST) of the gene of 
interest. The resultant product is then sequenced and 
analyzed to confirm that the 5' end sequence belongs to the 
partial sequence. 

Example 5 

Cloning and expression of BSG1 using the baculovirus 
expression system 

The DNA sequence encoding the full length BSG1 protein, 
ATCC # 97175, was amplified using PCR oligonucleotide primers 
corresponding to the 5' and 3' sequences of the gene: 

The 5' primer has the sequence 
5' AAAGGATCCCCCGCCATCATGGATGTTTTCAAGAAG 3' (SEQ ID NO: 25) 
and contains a BamHI restriction enzyme site (in bold) 
followed by 8 nucleotides resembling an efficient signal for 
the initiation of translation in eukaryotic cells (Kozak, M., 
J. Mol. Biol., 196:947-950 (1987)) of the BSG1 gene (the 
initiation codon for translation "ATG" is underlined). 

The 3' primer has the sequence 
5' AAATCTAGACTAGTCTCCCCCACTCTG 3' (SEQ ID NO: 26) and contains 
the cleavage site for the restriction endonuclease XbaI and 21 
nucleotides complementary to the 3' sequence of the BSG1 
gene. The amplified sequences were isolated from a 1% agarose 
gel using a commercially available kit ("Geneclean," BIO 101 
Inc., La Jolla, Ca.). The fragment was then digested with the 
endonucleases BamHI and XbaI and then purified again on a 1% 
agarose gel. This fragment is designated P2. 
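
The design of the two primers can be sanity-checked computationally. The following is a minimal sketch, offered as an illustration rather than as part of the described method: the primer strings are copied from SEQ ID NO: 25 and 26 as printed above (with the line-break space removed), the recognition sequences GGATCC (BamHI) and TCTAGA (XbaI) are the standard ones, and the helper names are illustrative.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from position 0; '*' marks a stop codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


FWD = "AAAGGATCCCCCGCCATCATGGATGTTTTCAAGAAG"  # 5' primer, SEQ ID NO: 25
REV = "AAATCTAGACTAGTCTCCCCCACTCTG"           # 3' primer, SEQ ID NO: 26
BAMHI, XBAI = "GGATCC", "TCTAGA"

start = FWD.find("ATG")                 # first ATG in the 5' primer
fwd_peptide = translate(FWD[start:])    # residues encoded after the ATG
rev_sense = reverse_complement(REV)     # 3' primer read on the coding strand
rev_peptide = translate(rev_sense)

print("BamHI site at index", FWD.find(BAMHI))
print("XbaI site at index", REV.find(XBAI))
print("5' primer encodes:", fwd_peptide)
print("3' primer (sense strand) encodes:", rev_peptide)
```

Read this way, the 5' primer carries the BamHI site ahead of an ATG that opens the frame M-D-V-F-K-K, and the coding-strand reading of the 3' primer ends Q-S-G-G-D followed by a stop codon and then the XbaI site.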

The vector pA2 (modification of pVL941 vector, discussed 
below) is used for the expression of the BSG1 protein using 
the baculovirus expression system (for review see: Summers, 
M.D. and Smith, G.E. 1987, A manual of methods for 
baculovirus vectors and insect cell culture procedures, Texas 
Agricultural Experimental Station Bulletin No. 1555). This 
expression vector contains the strong polyhedrin promoter of 
the Autographa californica nuclear polyhedrosis virus 
(AcMNPV) followed by the recognition sites for the 
restriction endonucleases BamHI and XbaI. The 
polyadenylation site of the simian virus (SV)40 is used for 
efficient polyadenylation. For an easy selection of 
recombinant virus the beta-galactosidase gene from E.coli is 
inserted in the same orientation as the polyhedrin promoter 
followed by the polyadenylation signal of the polyhedrin 
gene. The polyhedrin sequences are flanked at both sides by 
viral sequences for the cell-mediated homologous 
recombination of co-transfected wild-type viral DNA. Many 
other baculovirus vectors could be used in place of pA2 such 
as pRG1, pAc373, pVL941 and pAcIM1 (Luckow, V.A. and Summers, 
M.D., Virology, 170:31-39). 

The plasmid was digested with the restriction enzymes 
BamHI and XbaI and dephosphorylated using calf intestinal 
phosphatase by procedures known in the art. The DNA was then 
isolated from a 1% agarose gel using the commercially 
available kit ("Geneclean" BIO 101 Inc., La Jolla, Ca.). 
This vector DNA is designated V2. 

Fragment P2 and the dephosphorylated plasmid pA2 were 
ligated with T4 DNA ligase. E.coli HB101 cells were then 
transformed and bacteria identified that contained the 
plasmid (pBacBSG1) with the BSG1 gene using the enzymes BamHI 
and XbaI. The sequence of the cloned fragment was confirmed 
by DNA sequencing. 

5 µg of the plasmid pBacBSG1 was co-transfected with 1.0 
µg of a commercially available linearized baculovirus 
("BaculoGold™ baculovirus DNA", Pharmingen, San Diego, CA.) 
using the lipofection method (Felgner et al. Proc. Natl. 
Acad. Sci. USA, 84:7413-7417 (1987)). 

1 µg of BaculoGold™ virus DNA and 5 µg of the plasmid 
pBacBSG1 were mixed in a sterile well of a microtiter plate 
containing 50 µl of serum free Grace's medium (Life 
Technologies Inc., Gaithersburg, MD). Afterwards 10 µl 
Lipofectin plus 90 µl Grace's medium were added, mixed and 
incubated for 15 minutes at room temperature. Then the 
transfection mixture was added drop-wise to the Sf9 insect 
cells (ATCC CRL 1711) seeded in a 35 mm tissue culture plate 
with 1 ml Grace's medium without serum. The plate was rocked 
back and forth to mix the newly added solution. The plate 
was then incubated for 5 hours at 27°C. After 5 hours the 
transfection solution was removed from the plate and 1 ml of 
Grace's insect medium supplemented with 10% fetal calf serum 
was added. The plate was put back into an incubator and 
cultivation continued at 27°C for four days. 

After four days the supernatant was collected and a 
plaque assay performed similar as described by Summers and 
Smith (supra). As a modification an agarose gel with "Blue 
Gal" (Life Technologies Inc., Gaithersburg) was used which 
allows an easy isolation of blue stained plaques. (A 
detailed description of a "plaque assay" can also be found in 
the user's guide for insect cell culture and baculovirology 
distributed by Life Technologies Inc., Gaithersburg, pages 
9-10). 

Four days after the serial dilution, the virus was added 
to the cells and blue stained plaques were picked with the 
tip of an Eppendorf pipette. The agar containing the 
recombinant viruses was then resuspended in an Eppendorf tube 
containing 200 µl of Grace's medium. The agar was removed by 
a brief centrifugation and the supernatant containing the 
recombinant baculovirus was used to infect Sf9 cells seeded 
in 35 mm dishes. Four days later the supernatants of these 
culture dishes were harvested and then stored at 

Sf9 cells were grown in Grace's medium supplemented with 
10% heat-inactivated FBS. The cells were infected with the 
recombinant baculovirus V-BSG1 at a multiplicity of infection 
(MOI) of 2. Six hours later the medium was removed and 
replaced with SF900 II medium minus methionine and cysteine 
(Life Technologies Inc., Gaithersburg). 42 hours later 5 µCi 
of 35S-methionine and 5 µCi 35S-cysteine (Amersham) were added. 
The cells were further incubated for 16 hours before they 
were harvested by centrifugation and the labelled proteins 
visualized by SDS-PAGE and autoradiography. 
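
The infection and labelling schedule above reduces to two small calculations. The following is a minimal sketch: the multiplicity of infection of 2 and the 6, 42 and 16 hour intervals come from the text, whereas the cell count and virus titer are hypothetical placeholders (the text gives neither), and the function name is illustrative.

```python
def inoculum_volume_ml(moi: float, cell_count: float,
                       titer_pfu_per_ml: float) -> float:
    """Volume of virus stock that delivers `moi` plaque-forming units per cell."""
    return moi * cell_count / titer_pfu_per_ml


# Hypothetical values, NOT from the text: adjust to the actual culture and stock.
cells_per_dish = 1.0e6   # assumed number of Sf9 cells being infected
stock_titer = 1.0e8      # assumed virus titer in pfu/ml

volume_ml = inoculum_volume_ml(2, cells_per_dish, stock_titer)
print("virus stock to add (ml):", volume_ml)

# Intervals stated in the text, in hours after infection:
# medium change at 6 h, label added 42 h later, harvest 16 h after that.
intervals_h = [6, 42, 16]
label_added_at_h = sum(intervals_h[:2])
harvest_at_h = sum(intervals_h)
print("label added at (h):", label_added_at_h)
print("harvest at (h):", harvest_at_h)
```

With the stated intervals, the radiolabel is added 48 hours after infection and the cells are harvested at 64 hours.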

Example 6 

Expression of Recombinant BSG1 in COS cells 

The expression of plasmid, BSG1 HA, is derived from a 
vector pcDNAI/Amp (Invitrogen) containing: 1) SV40 origin of 
replication, 2) ampicillin resistance gene, 3) E.coli 
replication origin, 4) CMV promoter followed by a polylinker 
region, an SV40 intron and polyadenylation site. A DNA 
fragment encoding the entire precursor and a HA tag fused in 
frame to its 3' end was cloned into the polylinker region of 
the vector; therefore, the recombinant protein expression is 
directed under the CMV promoter. The HA tag corresponds to 
an epitope derived from the influenza hemagglutinin protein 
as previously described (I. Wilson, H. Niman, R. Heighten, A. 
Cherenson, M. Connolly, and R. Lerner, 1984, Cell 37:767). 
The infusion of HA tag to the target protein allows 
easy detection of the recombinant protein with an antibody 
that recognizes the HA epitope. 

The plasmid construction strategy is described as 
follows: 

The DNA sequence encoding BSG1, ATCC # 97175, was 
constructed by PCR using two primers: the 5' primer 
5' AAAGGATCCCCCGCCATCATGGATGTTTTCAAGAAG 3' (SEQ ID NO: 27) contains a 
BamHI site followed by 18 nucleotides of BSG1 coding sequence 
starting from the initiation codon; the 3' sequence 
5' AAATCTAGACTAAAGCGTAGTCTGGGACGTCGTATGGGTACT 

3' (SEQ ID NO: 28) contains complementary sequences to an XbaI 
site, translation stop codon, HA tag and the last 18 
nucleotides of the BSG1 coding sequence (not including the 
stop codon). Therefore, the PCR product contains a BamHI 
site, BSG1 coding sequence followed by HA tag fused in frame, 
a translation termination stop codon next to the HA tag, and 
an XbaI site. The PCR amplified DNA fragment and the vector, 
pcDNAI/Amp, were digested with BamHI and XbaI restriction 
enzymes and ligated. The ligation mixture was transformed into 
E. coli strain SURE (available from Stratagene Cloning 
Systems, 11099 North Torrey Pines Road, La Jolla, CA 92037); 
the transformed culture was plated on ampicillin media plates 
and resistant colonies were selected. Plasmid DNA was 
isolated from transformants and examined by restriction 
analysis for the presence of the correct fragment. For 
expression of the recombinant BSG protein, COS cells were 
transfected with the expression vector by DEAE-DEXTRAN method 
(J. Sambrook, E. Fritsch, T. Maniatis, Molecular Cloning: A 
Laboratory Manual, Cold Spring Harbor Laboratory Press, (1989)). 
The expression of the BSG HA protein was detected by 
radiolabelling and immunoprecipitation method (E. Harlow, D. 
Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor 
Laboratory Press, (1988)). Cells were labelled for 8 hours 
with 35S-cysteine two days post transfection. Culture media 
was then collected and cells were lysed with detergent (RIPA 
buffer (150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 
50 mM Tris, pH 7.5) (Wilson, I. et al., Id. 37:767 (1984))). 
Both cell lysate and culture media were precipitated with an 
HA specific monoclonal antibody. Proteins precipitated were 
analyzed on 15% SDS-PAGE gels. 

Numerous modifications and variations of the present 
invention are possible in light of the above teachings and, 
therefore, within the scope of the appended claims, the 
invention may be practiced otherwise than as particularly 
described. 

WHAT IS CLAIMED IS: 

1. An isolated polynucleotide comprising a member 
selected from the group consisting of 

(a) a polynucleotide encoding the same 
polypeptide as the polynucleotide of Figure 1 (SEQ ID NO:1); 

(b) a polynucleotide encoding the same mature 
polypeptide as a human gene having a coding portion which 
includes DNA having at least a 90% identity to the DNA of one 
of Figures 2-20 (SEQ ID NO:2-20); 

(c) a polynucleotide which hybridizes to the 
polynucleotide of (a) and which has at least a 70% identity 
thereto; and 

(d) a polynucleotide encoding the same mature 
polypeptide as a human gene having a coding portion which 
includes DNA having at least a 90% identity to a DNA included 
in the deposited clone. 

2. The polynucleotide of Claim 1 wherein the human 
gene includes DNA contained in the deposited clone. 

3. The polynucleotide of Claim 1 wherein the member is 
a polynucleotide encoding the same polypeptide as the 
polynucleotide of Figure 1 (SEQ ID NO:1). 

4. A vector containing the polynucleotide of claim 1. 

5. A host cell transformed or transfected with the 
vector of Claim 4. 

6. A process for producing cells capable of expressing 
a polypeptide comprising genetically engineering cells with 
the vector of Claim 4. 

7. A process for producing a polypeptide comprising: 
expressing from the host cell of Claim 5 the polypeptide 
encoded by said polynucleotide. 

8. A polypeptide comprising a member selected from the 
group consisting of: (i) a polypeptide encoded by a human 
gene, said human gene having a coding portion whose DNA has 
at least a 90% identity to the DNA of one of Figures 2-20 
(SEQ ID NO:2-20); (ii) a polypeptide having the deduced amino 
acid sequence as set forth in Figure 1 (SEQ ID NO:1) and 
fragments, analogs and derivatives thereof; and (iii) a 
polypeptide encoded by the human gene whose coding region 
includes a DNA having at least a 90% identity to the DNA 
contained in the deposited clone and fragments, analogs and 
derivatives of said polypeptide. 

9. The polypeptide of Claim 8 wherein the polypeptide 
has the deduced amino acid sequence as set forth in Figure 1 
(SEQ ID NO:1). 

10. An antibody against the polypeptide of claim 8. 

11. A compound which inhibits activation of the 
polypeptide of claim 8. 

12. A method for the treatment of a patient having need 
to inhibit a breast specific gene protein comprising: 
administering to the patient a therapeutically effective 
amount of the compound of Claim 11. 

13. The method of claim 12 wherein the compound is a 
polypeptide and the therapeutically effective amount of the 
compound is administered by providing to the patient DNA 
encoding said polypeptide and expressing said polypeptide in 
vivo. 

14. A method for the treatment of a patient having need 
of a breast specific gene protein comprising: administering 
to the patient a therapeutically effective amount of the 
polypeptide of claim 8. 

15. A process for diagnosing a disorder of the breast 
in a host comprising: 

determining transcription of a human gene in a 
sample derived from non-breast tissue of a host, said gene 
having a coding portion which includes DNA having at least 
90% identity to DNA selected from the group consisting of the 
DNA of Figures 1-20 (SEQ ID NO:1-20), whereby said 
transcription indicates a disorder of the breast in the host. 

16. The process of claim 15 wherein transcription is 
determined by detecting the presence of an altered level of 
RNA transcribed from said human gene. 

17. The process of claim 15 wherein transcription is 
determined by detecting the presence of an altered level of 
DNA complementary to the RNA transcribed from said human 
gene. 

18. The process of claim 15 wherein transcription is 
determined by detecting the presence of an altered level of 
an expression product of said human gene. 

19. A process for determining a disorder of a breast in 
a host comprising: 

contacting an antibody specific to a BSG antigen or 
an epitopic portion thereof, to a fluid sample derived from 
a host; 

determining the presence of an altered level of a 
BSG gene product in said sample. 

20. A process for identifying antagonists to the 
polypeptide of claim 8 comprising: 

contacting said polypeptide with a natural 
substrate and a labeled compound to be screened either 
simultaneously or in either consecutive order; and 

determining whether the therapeutic effectively 
competes with the natural substrate in a manner sufficient to 
prevent binding of the protein to its substrate. 